Search Results: "rra"

16 September 2024

Russ Allbery: Review: The Wings Upon Her Back

Review: The Wings Upon Her Back, by Samantha Mills
Publisher: Tachyon
Copyright: 2024
ISBN: 1-61696-415-4
Format: Kindle
Pages: 394
The Wings Upon Her Back is a political steampunk science fantasy novel. If the author's name sounds familiar, it may be because Samantha Mills's short story "Rabbit Test" won Nebula, Locus, Hugo, and Sturgeon awards. This is her first novel. Winged Zemolai is a soldier of the mecha god and the protege of Mecha Vodaya, the Voice. She has served the city-state of Radezhda by defending it against all enemies, foreign and domestic, for twenty-six years. Despite that, it takes only a moment of errant mercy for her entire life to come crashing down. On a whim, she spares a kitchen worker who was concealing a statue of the scholar god, meaning that he was only pretending to worship the worker god like all workers should. Vodaya is unforgiving and uncompromising, as is the sleeping mecha god. Zemolai's wings are ripped from her back and crushed in the hand of the god, and she's left on the ground to die of mechalin withdrawal. The Wings Upon Her Back is told in two alternating timelines. The main one follows Zemolai after her exile as she is rescued by a young group of revolutionaries who think she may be useful in their plans. The other thread starts with Zemolai's childhood and shows the reader how she became Winged Zemolai: her scholar family, her obsession with flying, her true devotion to the mecha god, and the critical early years when she became Vodaya's protege. Mills maintains the separate timelines through the book and wraps them up in a rather neat piece of symbolic parallelism in the epilogue. I picked up this book on a recommendation from C.L. Clark, and yes, indeed, I can see why she liked this book. It's a story about a political awakening, in which Zemolai slowly realizes that she has been manipulated and lied to and that she may, in fact, be one of the baddies. The Wings Upon Her Back is more personal than some other books with that theme, since Zemolai was specifically (and abusively) groomed for her role by Vodaya. 
Much of the book is Zemolai trying to pull out the hooks that Vodaya put in her or, in the flashback timeline, the reader watching Vodaya install those hooks. The flashback timeline is difficult reading. I don't think Mills could have left it out, but she says in the afterword that it was the hardest part of the book to write and it was also the hardest part of the book to read. It fills in some interesting bits of world-building and backstory, and Mills does a great job pacing the story revelations so that both threads contribute equally, but mostly it's a story of manipulative abuse. We know from the main storyline that Vodaya's tactics work, which gives those scenes the feel of a slow-motion train wreck. You know what's going to happen, you know it will be bad, and yet you can't look away. It occurred to me while reading this that Emily Tesh's Some Desperate Glory told a similar type of story without the flashback structure, which eliminates the stifling feeling of inevitability. I don't think that would have worked for this story. If you simply rearranged the chapters of The Wings Upon Her Back into a linear narrative, I would have bailed on the book. Watching Zemolai being manipulated would have been too depressing and awful for me to make it to the payoff without the forward-looking hope of the main timeline. It gave me new appreciation for the difficulty of what Tesh pulled off. Mills uses this interwoven structure well, though. At about 90% through this book I had no idea how it could end in the space remaining, but it reaches a surprising and satisfying conclusion. Mills uses a type of ending that normally bothers me, but she does it by handling the psychological impact so well that I couldn't help but admire it. I'm avoiding specifics because I think it worked better when I wasn't expecting it, but it ties beautifully into the thematic point of the book. I do have one structural objection, though. 
It's one of those problems I didn't notice while reading, but that started bothering me when I thought back through the story from a political lens. The Wings Upon Her Back is Zemolai's story, her redemption arc, and that means she drives the plot. The band of revolutionaries are great characters (particularly Galiana), but they're supporting characters. Zemolai is older, more experienced, and knows critical information they don't have, and she uses it to effectively take over. As setup for her character arc, I see why Mills did this. As political praxis, I have issues. There is a tendency in politics to believe that political skill is portable and repurposable. Converting opposing operatives to the cause is welcomed not only because they indicate added support, but also because they can use their political skill to help you win instead. To an extent this is not wrong, and is probably the most true of combat skills (which Zemolai has in abundance). But there's an underlying assumption that politics is symmetric, and a critical reason why I hold many of the political positions that I do hold is that I don't think politics is symmetric. If someone has been successfully stoking resentment and xenophobia in support of authoritarians, converts to an anti-authoritarian cause, and then produces propaganda stoking resentment and xenophobia against authoritarians, this is in some sense an improvement. But if one believes that resentment and xenophobia are inherently wrong, if one's politics are aimed at reducing the resentment and xenophobia in the world, then in a way this person has not truly converted. Worse, because this is an effective manipulation tactic, there is a strong tendency to put this type of political convert into a leadership position, where they will, intentionally or not, start turning the anti-authoritarian movement into a copy of the authoritarian movement they left. 
They haven't actually changed their politics because they haven't understood (or simply don't believe in) the fundamental asymmetry in the positions. It's the same criticism that I have of realpolitik: the ends do not justify the means because the means corrupt the ends. Nothing that happens in this book is as egregious as my example, but the more I thought about the plot structure, the more it bothered me that Zemolai never listens to the revolutionaries she joins long enough to wrestle with why she became an agent of an authoritarian state and they didn't. They got something fundamentally right that she got wrong, and perhaps that should have been reflected in who got to make future decisions. Zemolai made very poor choices and yet continues to be the sole main character of the story, the one whose decisions and actions truly matter. Maybe being wrong about everything should be disqualifying for being the main character, at least for a while, even if you think you've understood why you were wrong. That problem aside, I enjoyed this. Both timelines were compelling and quite difficult to put down, even when they got rather dark. I could have done with less body horror and a few fewer fight scenes, but I'm glad I read it. Science fiction readers should be warned that the world-building, despite having an intricate and fascinating surface, is mostly vibes. I started the book wondering how people with giant metal wings on their back can literally fly, and thought the mentions of neural ports, high-tech materials, and immune-suppressing drugs might mean that we'd get some sort of explanation. We do not: heavier-than-air flight works because it looks really cool and serves some thematic purposes. There are enough hints of technology indistinguishable from magic that you could make up your own explanations if you wanted to, but that's not something this book is interested in. 
There's not a thing wrong with that, but don't get caught by surprise if you were in the mood for a neat scientific explanation of apparent magic. Recommended if you like somewhat-harrowing character development with a heavy political lens and steampunk vibes, although it's not the sort of book that I'd press into the hands of everyone I know. The Wings Upon Her Back is a complete story in a single novel. Content warning: the main character is a victim of physical and emotional abuse, so some of that is a lot. Also surgical gore, some torture, and genocide. Rating: 7 out of 10

8 September 2024

Jacob Adams: Linux's Bedtime Routine

How does Linux move from an awake machine to a hibernating one? How does it then manage to restore all state? These questions led me to read way too much C in trying to figure out how this particular hardware/software boundary is navigated. This investigation will be split into a few parts, with the first one going from invocation of hibernation to synchronizing all filesystems to disk. This article has been written using Linux version 6.9.9, the source of which can be found in many places, but can be navigated easily through the Bootlin Elixir Cross-Referencer: https://elixir.bootlin.com/linux/v6.9.9/source Each code snippet will begin with a link to the above giving the file path and the line number of the beginning of the snippet.

A Starting Point for Investigation: /sys/power/state and /sys/power/disk

These two sysfs files exist to allow debugging of hibernation, and thus control the exact state used directly. Writing specific values to the state file controls the exact sleep mode used, and disk controls the specific hibernation mode[1]. This is extremely handy as an entry point to understand how these systems work, since we can just follow what happens when they are written to.
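As a rough illustration of what "writing a value to the state file" means from user space, here is a minimal sketch. The helper name sk_write_attr is hypothetical, and the O_CREAT flag is only there so that the helper can be tried against an ordinary scratch file.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the kernel or any library): write a
 * short string to a sysfs-style attribute file with a single write().
 * Returns 0 on success and -1 on failure.
 */
static int sk_write_attr(const char *path, const char *value)
{
	assert(path != NULL && value != NULL);

	/* O_CREAT only matters for scratch files; sysfs files already exist. */
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	size_t len = strlen(value);
	ssize_t written = write(fd, value, len);

	close(fd);
	return (written == (ssize_t)len) ? 0 : -1;
}
```

Calling sk_write_attr("/sys/power/state", "disk\n") as root would ask the kernel to hibernate the machine for real, so it is safer to point it at a scratch file when experimenting.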

Show and Store Functions

These two files are defined using the power_attr macro:
kernel/power/power.h:80
#define power_attr(_name) \
static struct kobj_attribute _name##_attr = {	\
	.attr	= {				\
		.name = __stringify(_name),	\
		.mode = 0644,			\
	},					\
	.show	= _name##_show,			\
	.store	= _name##_store,		\
}
show is called on reads and store on writes. state_show is a little boring for our purposes, as it just prints all the available sleep states. kernel/power/main.c:657
/*
 * state - control system sleep states.
 *
 * show() returns available sleep state labels, which may be "mem", "standby",
 * "freeze" and "disk" (hibernation).
 * See Documentation/admin-guide/pm/sleep-states.rst for a description of
 * what they mean.
 *
 * store() accepts one of those strings, translates it into the proper
 * enumerated value, and initiates a suspend transition.
 */
static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *attr,
			  char *buf)
{
	char *s = buf;
#ifdef CONFIG_SUSPEND
	suspend_state_t i;
	for (i = PM_SUSPEND_MIN; i < PM_SUSPEND_MAX; i++)
		if (pm_states[i])
			s += sprintf(s,"%s ", pm_states[i]);
#endif
	if (hibernation_available())
		s += sprintf(s, "disk ");
	if (s != buf)
		/* convert the last space to a newline */
		*(s-1) = '\n';
	return (s - buf);
}
state_store, however, provides our entry point. If the string disk is written to the state file, it calls hibernate().
kernel/power/main.c:715
static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
			   const char *buf, size_t n)
{
	suspend_state_t state;
	int error;
	error = pm_autosleep_lock();
	if (error)
		return error;
	if (pm_autosleep_state() > PM_SUSPEND_ON) {
		error = -EBUSY;
		goto out;
	}
	state = decode_state(buf, n);
	if (state < PM_SUSPEND_MAX) {
		if (state == PM_SUSPEND_MEM)
			state = mem_sleep_current;
		error = pm_suspend(state);
	} else if (state == PM_SUSPEND_MAX) {
		error = hibernate();
	} else {
		error = -EINVAL;
	}
 out:
	pm_autosleep_unlock();
	return error ? error : n;
}
kernel/power/main.c:688
static suspend_state_t decode_state(const char *buf, size_t n)
{
#ifdef CONFIG_SUSPEND
	suspend_state_t state;
#endif
	char *p;
	int len;
	p = memchr(buf, '\n', n);
	len = p ? p - buf : n;
	/* Check hibernation first. */
	if (len == 4 && str_has_prefix(buf, "disk"))
		return PM_SUSPEND_MAX;
#ifdef CONFIG_SUSPEND
	for (state = PM_SUSPEND_MIN; state < PM_SUSPEND_MAX; state++) {
		const char *label = pm_states[state];
		if (label && len == strlen(label) && !strncmp(buf, label, len))
			return state;
	}
#endif
	return PM_SUSPEND_ON;
}
Could we have figured this out just via function names? Sure, but this way we know for sure that nothing else is happening before this function is called.
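To make the parsing rules concrete, here is a user-space sketch of the same decoding logic. The sk_ names and the enum values are stand-ins chosen for illustration, and the label table assumes all three suspend states are supported; on a real kernel, unsupported states have no label.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-ins for the kernel's suspend_state_t values. */
enum sk_state {
	SK_SUSPEND_ON = 0,	/* returned when no valid label is found */
	SK_SUSPEND_TO_IDLE,	/* "freeze" */
	SK_SUSPEND_STANDBY,	/* "standby" */
	SK_SUSPEND_MEM,		/* "mem" */
	SK_SUSPEND_MAX,		/* sentinel that doubles as "hibernate" */
};

/* Label table indexed by state; index 0 has no label. */
static const char *const sk_labels[SK_SUSPEND_MAX] = {
	NULL, "freeze", "standby", "mem",
};

static enum sk_state sk_decode_state(const char *buf, size_t n)
{
	assert(buf != NULL);

	/* Ignore everything from the first newline onwards. */
	const char *p = memchr(buf, '\n', n);
	size_t len = p ? (size_t)(p - buf) : n;

	/* Hibernation is checked first and maps to the MAX sentinel. */
	if (len == 4 && strncmp(buf, "disk", 4) == 0)
		return SK_SUSPEND_MAX;

	for (int s = SK_SUSPEND_TO_IDLE; s < SK_SUSPEND_MAX; s++) {
		const char *label = sk_labels[s];

		if (label && len == strlen(label) && strncmp(buf, label, len) == 0)
			return (enum sk_state)s;
	}
	return SK_SUSPEND_ON;
}
```

The length comparison is what rejects inputs such as "diskette": a prefix match alone is not enough.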

Autosleep

Our first detour is into the autosleep system. When checking the state above, you may notice that the kernel grabs the pm_autosleep_lock before checking the current state. autosleep is a mechanism originally from Android that sends the entire system to either suspend or hibernate whenever it is not actively working on anything. This is not enabled for most desktop configurations, since it's primarily for mobile systems and inverts the standard suspend and hibernate interactions. This system is implemented as a workqueue[2] that checks the current number of wakeup events, processes and drivers that need to run[3], and if there aren't any, then the system is put into the autosleep state, typically suspend. However, it could be hibernate if configured that way via /sys/power/autosleep, in a similar manner to using /sys/power/state to manually enable hibernation.
kernel/power/main.c:841
static ssize_t autosleep_store(struct kobject *kobj,
			       struct kobj_attribute *attr,
			       const char *buf, size_t n)
{
	suspend_state_t state = decode_state(buf, n);
	int error;
	if (state == PM_SUSPEND_ON
	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
		return -EINVAL;
	if (state == PM_SUSPEND_MEM)
		state = mem_sleep_current;
	error = pm_autosleep_set_state(state);
	return error ? error : n;
}
power_attr(autosleep);
#endif /* CONFIG_PM_AUTOSLEEP */
kernel/power/autosleep.c:24
static DEFINE_MUTEX(autosleep_lock);
static struct wakeup_source *autosleep_ws;
static void try_to_suspend(struct work_struct *work)
{
	unsigned int initial_count, final_count;
	if (!pm_get_wakeup_count(&initial_count, true))
		goto out;
	mutex_lock(&autosleep_lock);
	if (!pm_save_wakeup_count(initial_count) ||
		system_state != SYSTEM_RUNNING) {
		mutex_unlock(&autosleep_lock);
		goto out;
	}
	if (autosleep_state == PM_SUSPEND_ON) {
		mutex_unlock(&autosleep_lock);
		return;
	}
	if (autosleep_state >= PM_SUSPEND_MAX)
		hibernate();
	else
		pm_suspend(autosleep_state);
	mutex_unlock(&autosleep_lock);
	if (!pm_get_wakeup_count(&final_count, false))
		goto out;
	/*
	 * If the wakeup occurred for an unknown reason, wait to prevent the
	 * system from trying to suspend and waking up in a tight loop.
	 */
	if (final_count == initial_count)
		schedule_timeout_uninterruptible(HZ / 2);
 out:
	queue_up_suspend_work();
}
static DECLARE_WORK(suspend_work, try_to_suspend);
void queue_up_suspend_work(void)
{
	if (autosleep_state > PM_SUSPEND_ON)
		queue_work(autosleep_wq, &suspend_work);
}

The Steps of Hibernation

Hibernation Kernel Config

It's important to note that most of the hibernate-specific functions below do nothing unless you've defined CONFIG_HIBERNATION in your Kconfig[4]. As an example, hibernate itself is defined as the following if CONFIG_HIBERNATION is not set.
include/linux/suspend.h:407
static inline int hibernate(void) { return -ENOSYS; }
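The general pattern here is a compile-time switch between a real declaration and an inline stub. A minimal sketch of that pattern, using a hypothetical SK_CONFIG_HIBERNATION macro in place of the kernel's config option, might look like this:

```c
#include <errno.h>

/*
 * SK_CONFIG_HIBERNATION is a hypothetical stand-in for the kernel's
 * CONFIG_HIBERNATION option. Leave it undefined to get the stub.
 */
#ifdef SK_CONFIG_HIBERNATION
int sk_hibernate(void);		/* real implementation would live elsewhere */
#else
/* Stub: callers still compile, but every call reports "not implemented". */
static inline int sk_hibernate(void)
{
	return -ENOSYS;
}
#endif
```

The benefit of this approach is that callers never need their own #ifdef; they call the function unconditionally and handle the error code.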

Check if Hibernation is Available

We begin by confirming that we actually can perform hibernation, via the hibernation_available function.
kernel/power/hibernate.c:742
if (!hibernation_available()) {
	pm_pr_dbg("Hibernation not available.\n");
	return -EPERM;
}
kernel/power/hibernate.c:92
bool hibernation_available(void)
{
	return nohibernate == 0 &&
		!security_locked_down(LOCKDOWN_HIBERNATION) &&
		!secretmem_active() && !cxl_mem_active();
}
nohibernate is controlled by the kernel command line; it's set via either nohibernate or hibernate=no. security_locked_down is a hook for Linux Security Modules to prevent hibernation. This is used to prevent hibernating to an unencrypted storage device, as specified in the manual page kernel_lockdown(7). Interestingly, either level of lockdown, integrity or confidentiality, locks down hibernation, because with the ability to hibernate you can extract basically anything from memory and even reboot into a modified kernel image. secretmem_active checks whether there is any active use of memfd_secret, and if so it prevents hibernation. memfd_secret returns a file descriptor that can be mapped into a process but is specifically unmapped from the kernel's memory space. Hibernating with memory that not even the kernel is supposed to access would expose that memory to whoever could access the hibernation image. This particular feature of secret memory was apparently controversial, though not as controversial as performance concerns around fragmentation when unmapping kernel memory (which did not end up being a real problem). cxl_mem_active just checks whether any CXL memory is active. A full explanation is provided in the commit introducing this check, but there's also a shortened explanation from cxl_mem_probe, which sets the relevant flag when initializing a CXL memory device.
drivers/cxl/mem.c:186
* The kernel may be operating out of CXL memory on this device,
* there is no spec defined way to determine whether this device
* preserves contents over suspend, and there is no simple way
* to arrange for the suspend image to avoid CXL memory which
* would setup a circular dependency between PCI resume and save
* state restoration.

Check Compression

The next check is for whether compression support is enabled, and if so whether the requested algorithm is enabled.
kernel/power/hibernate.c:747
/*
 * Query for the compression algorithm support if compression is enabled.
 */
if (!nocompress) {
	strscpy(hib_comp_algo, hibernate_compressor, sizeof(hib_comp_algo));
	if (crypto_has_comp(hib_comp_algo, 0, 0) != 1) {
		pr_err("%s compression is not available\n", hib_comp_algo);
		return -EOPNOTSUPP;
	}
}
The nocompress flag is set via the hibernate command line parameter, setting hibernate=nocompress. If compression is enabled, then hibernate_compressor is copied to hib_comp_algo. This synchronizes the current requested compression setting (hibernate_compressor) with the current compression setting (hib_comp_algo). Both values are character arrays of size CRYPTO_MAX_ALG_NAME (128 in this kernel). kernel/power/hibernate.c:50
static char hibernate_compressor[CRYPTO_MAX_ALG_NAME] = CONFIG_HIBERNATION_DEF_COMP;
/*
 * Compression/decompression algorithm to be used while saving/loading
 * image to/from disk. This would later be used in 'kernel/power/swap.c'
 * to allocate comp streams.
 */
char hib_comp_algo[CRYPTO_MAX_ALG_NAME];
hibernate_compressor defaults to lzo if that algorithm is enabled, otherwise to lz4 if enabled[5]. It can be overridden using the hibernate.compressor setting to either lzo or lz4.
kernel/power/Kconfig:95
choice
	prompt "Default compressor"
	default HIBERNATION_COMP_LZO
	depends on HIBERNATION
config HIBERNATION_COMP_LZO
	bool "lzo"
	depends on CRYPTO_LZO
config HIBERNATION_COMP_LZ4
	bool "lz4"
	depends on CRYPTO_LZ4
endchoice
config HIBERNATION_DEF_COMP
	string
	default "lzo" if HIBERNATION_COMP_LZO
	default "lz4" if HIBERNATION_COMP_LZ4
	help
	  Default compressor to be used for hibernation.
kernel/power/hibernate.c:1425
static const char * const comp_alg_enabled[] = {
#if IS_ENABLED(CONFIG_CRYPTO_LZO)
	COMPRESSION_ALGO_LZO,
#endif
#if IS_ENABLED(CONFIG_CRYPTO_LZ4)
	COMPRESSION_ALGO_LZ4,
#endif
};
static int hibernate_compressor_param_set(const char *compressor,
		const struct kernel_param *kp)
{
	unsigned int sleep_flags;
	int index, ret;
	sleep_flags = lock_system_sleep();
	index = sysfs_match_string(comp_alg_enabled, compressor);
	if (index >= 0) {
		ret = param_set_copystring(comp_alg_enabled[index], kp);
		if (!ret)
			strscpy(hib_comp_algo, comp_alg_enabled[index],
				sizeof(hib_comp_algo));
	} else {
		ret = index;
	}
	unlock_system_sleep(sleep_flags);
	if (ret)
		pr_debug("Cannot set specified compressor %s\n",
			 compressor);
	return ret;
}
static const struct kernel_param_ops hibernate_compressor_param_ops = {
	.set    = hibernate_compressor_param_set,
	.get    = param_get_string,
};
static struct kparam_string hibernate_compressor_param_string = {
	.maxlen = sizeof(hibernate_compressor),
	.string = hibernate_compressor,
};
We then check whether the requested algorithm is supported via crypto_has_comp. If not, we bail out of the whole operation with -EOPNOTSUPP. As part of crypto_has_comp we perform any needed initialization of the algorithm, loading kernel modules and running initialization code as needed[6].
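The parameter setter above boils down to "find the requested name in a fixed list of enabled algorithms, or fail". A user-space sketch of that lookup follows. The sk_ names are hypothetical, the list assumes both algorithms are built in, and the newline handling is a simplification of how sysfs-style string matching tolerates a trailing newline.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed list of enabled algorithms; a real kernel builds this from Kconfig. */
static const char *const sk_enabled[] = { "lzo", "lz4" };

/* Compare two strings, ignoring one trailing newline on the request. */
static int sk_name_eq(const char *req, const char *name)
{
	size_t len = strlen(req);

	if (len > 0 && req[len - 1] == '\n')
		len--;
	return len == strlen(name) && strncmp(req, name, len) == 0;
}

/* Return the index of the matching algorithm, or -EINVAL if none matches. */
static int sk_match_compressor(const char *req)
{
	assert(req != NULL);

	for (size_t i = 0; i < sizeof(sk_enabled) / sizeof(sk_enabled[0]); i++) {
		if (sk_name_eq(req, sk_enabled[i]))
			return (int)i;
	}
	return -EINVAL;
}
```

Returning the index (rather than a boolean) lets the caller copy the canonical name from the list instead of the user-supplied string.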

Grab Locks

The next step is to grab the sleep and hibernation locks via lock_system_sleep and hibernate_acquire.
kernel/power/hibernate.c:758
sleep_flags = lock_system_sleep();
/* The snapshot device should not be opened while we're running */
if (!hibernate_acquire()) {
	error = -EBUSY;
	goto Unlock;
}
First, lock_system_sleep marks the current thread as not freezable, which will be important later[7]. It then grabs the system_transition_mutex, which blocks taking snapshots or modifying how they are taken, resuming from a hibernation image, entering any suspend state, or rebooting.
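The second lock in that snippet, hibernate_acquire, behaves like a try-lock: it either takes the single available token or fails immediately, which is why the caller bails out with -EBUSY. In this kernel version it is built on atomic_add_unless. A rough user-space model with C11 atomics is sketched below; the initial value of 1 and the release function are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* 1 means "free", 0 means "held"; the initial value is an assumption. */
static atomic_int sk_hibernate_atomic = 1;

/* Subtract 1 unless the value is already 0; report whether we subtracted. */
static bool sk_hibernate_acquire(void)
{
	int cur = atomic_load(&sk_hibernate_atomic);

	while (cur != 0) {
		/* On failure, cur is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(&sk_hibernate_atomic, &cur, cur - 1))
			return true;
	}
	return false;
}

/* Give the token back so the next caller can acquire it. */
static void sk_hibernate_release(void)
{
	int prev = atomic_fetch_add(&sk_hibernate_atomic, 1);

	assert(prev == 0);	/* releasing without holding would be a bug */
}
```

Unlike a mutex, a failed acquire never sleeps, which suits a caller that would rather report "busy" than wait.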

The GFP Mask

The kernel also issues a warning if the gfp mask is changed via either pm_restore_gfp_mask or pm_restrict_gfp_mask without holding the system_transition_mutex. GFP flags tell the kernel how it is permitted to handle a request for memory.
include/linux/gfp_types.h:12
 * GFP flags are commonly used throughout Linux to indicate how memory
 * should be allocated.  The GFP acronym stands for get_free_pages(),
 * the underlying memory allocation function.  Not every GFP flag is
 * supported by every function which may allocate memory.
In the case of hibernation specifically we care about the IO and FS flags, which are reclaim modifiers: ways the system is permitted to attempt to free up memory in order to satisfy a specific request for memory.
include/linux/gfp_types.h:176
 * Reclaim modifiers
 * -----------------
 * Please note that all the following flags are only applicable to sleepable
 * allocations (e.g. %GFP_NOWAIT and %GFP_ATOMIC will ignore them).
 *
 * %__GFP_IO can start physical IO.
 *
 * %__GFP_FS can call down to the low-level FS. Clearing the flag avoids the
 * allocator recursing into the filesystem which might already be holding
 * locks.
gfp_allowed_mask sets which flags are permitted to be set at the current time. As the comment below outlines, preventing these flags from being set avoids situations where the kernel needs to do I/O to allocate memory (e.g. reading from or writing to swap[8]) but the devices it needs to read from or write to are not currently available.
kernel/power/main.c:24
/*
 * The following functions are used by the suspend/hibernate code to temporarily
 * change gfp_allowed_mask in order to avoid using I/O during memory allocations
 * while devices are suspended.  To avoid races with the suspend/hibernate code,
 * they should always be called with system_transition_mutex held
 * (gfp_allowed_mask also should only be modified with system_transition_mutex
 * held, unless the suspend/hibernate code is guaranteed not to run in parallel
 * with that modification).
 */
static gfp_t saved_gfp_mask;
void pm_restore_gfp_mask(void)
{
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	if (saved_gfp_mask) {
		gfp_allowed_mask = saved_gfp_mask;
		saved_gfp_mask = 0;
	}
}
void pm_restrict_gfp_mask(void)
{
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	WARN_ON(saved_gfp_mask);
	saved_gfp_mask = gfp_allowed_mask;
	gfp_allowed_mask &= ~(__GFP_IO | __GFP_FS);
}

Sleep Flags

After grabbing the system_transition_mutex, lock_system_sleep returns the thread's previous flags, which the caller captures in sleep_flags. This is used later to remove PF_NOFREEZE if it wasn't previously set on the current thread.
kernel/power/main.c:52
unsigned int lock_system_sleep(void)
{
	unsigned int flags = current->flags;
	current->flags |= PF_NOFREEZE;
	mutex_lock(&system_transition_mutex);
	return flags;
}
EXPORT_SYMBOL_GPL(lock_system_sleep);
include/linux/sched.h:1633
#define PF_NOFREEZE		0x00008000	/* This thread should not be frozen */
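The save-and-restore dance around PF_NOFREEZE can be modelled in a few lines of user-space C. This is only a sketch: sk_task_flags stands in for current->flags, the mutex is omitted, and the unlock side is written from the description above rather than copied from the kernel.

```c
#include <assert.h>

#define SK_PF_NOFREEZE 0x00008000u	/* same bit value as the excerpt above */

/* Stand-in for current->flags; a real task has its own copy. */
static unsigned int sk_task_flags;

/* Set the bit and hand back the flags as they were before. */
static unsigned int sk_lock_system_sleep(void)
{
	unsigned int flags = sk_task_flags;

	sk_task_flags |= SK_PF_NOFREEZE;
	/* the real function also takes system_transition_mutex here */
	return flags;
}

/* Clear the bit only if the caller did not have it set originally. */
static void sk_unlock_system_sleep(unsigned int saved)
{
	assert(sk_task_flags & SK_PF_NOFREEZE);
	if (!(saved & SK_PF_NOFREEZE))
		sk_task_flags &= ~SK_PF_NOFREEZE;
}
```

Returning the old flags, instead of always clearing the bit on unlock, keeps threads that were already unfreezable (kernel threads, for instance) in that state afterwards.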
Then we grab the hibernate-specific semaphore to ensure no one can open a snapshot or resume from it while we perform hibernation. Additionally, this lock is used to prevent hibernate_quiet_exec, which is used by the nvdimm driver to activate its firmware with all processes and devices frozen, ensuring it is the only thing running at that time[9].
kernel/power/hibernate.c:82
bool hibernate_acquire(void)
{
	return atomic_add_unless(&hibernate_atomic, -1, 0);
}

Prepare Console

The kernel next calls pm_prepare_console. This function only does anything if CONFIG_VT_CONSOLE_SLEEP has been set. This prepares the virtual terminal for a suspend state, switching away to a console used only for the suspend state if needed.
kernel/power/console.c:130
void pm_prepare_console(void)
{
	if (!pm_vt_switch())
		return;
	orig_fgconsole = vt_move_to_console(SUSPEND_CONSOLE, 1);
	if (orig_fgconsole < 0)
		return;
	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
	return;
}
The first thing is to check whether we actually need to switch the VT.
kernel/power/console.c:94
/*
 * There are three cases when a VT switch on suspend/resume are required:
 *   1) no driver has indicated a requirement one way or another, so preserve
 *      the old behavior
 *   2) console suspend is disabled, we want to see debug messages across
 *      suspend/resume
 *   3) any registered driver indicates it needs a VT switch
 *
 * If none of these conditions is present, meaning we have at least one driver
 * that doesn't need the switch, and none that do, we can avoid it to make
 * resume look a little prettier (and suspend too, but that's usually hidden,
 * e.g. when closing the lid on a laptop).
 */
static bool pm_vt_switch(void)
{
	struct pm_vt_switch *entry;
	bool ret = true;
	mutex_lock(&vt_switch_mutex);
	if (list_empty(&pm_vt_switch_list))
		goto out;
	if (!console_suspend_enabled)
		goto out;
	list_for_each_entry(entry, &pm_vt_switch_list, head) {
		if (entry->required)
			goto out;
	}
	ret = false;
out:
	mutex_unlock(&vt_switch_mutex);
	return ret;
}
There is an explanation of the conditions under which a switch is performed in the comment above the function, but we'll also walk through the steps here. Firstly we grab the vt_switch_mutex to ensure nothing will modify the list while we're looking at it. We then examine the pm_vt_switch_list. This list is used to indicate the drivers that require a switch during suspend. They register this requirement, or the lack thereof, via pm_vt_switch_required.
kernel/power/console.c:31
/**
 * pm_vt_switch_required - indicate VT switch at suspend requirements
 * @dev: device
 * @required: if true, caller needs VT switch at suspend/resume time
 *
 * The different console drivers may or may not require VT switches across
 * suspend/resume, depending on how they handle restoring video state and
 * what may be running.
 *
 * Drivers can indicate support for switchless suspend/resume, which can
 * save time and flicker, by using this routine and passing 'false' as
 * the argument.  If any loaded driver needs VT switching, or the
 * no_console_suspend argument has been passed on the command line, VT
 * switches will occur.
 */
void pm_vt_switch_required(struct device *dev, bool required)
Next, we check console_suspend_enabled. This is set to false by the kernel parameter no_console_suspend, but defaults to true. Finally, if there are any entries in the pm_vt_switch_list, then we check to see if any of them require a VT switch. Only if none of these conditions apply do we return false. If a VT switch is in fact required, then we move first the currently active virtual terminal/console[10] (vt_move_to_console) and then the current location of kernel messages (vt_kmsg_redirect) to the SUSPEND_CONSOLE. The SUSPEND_CONSOLE is the last entry in the list of possible consoles, and appears to just be a black hole to throw away messages.
kernel/power/console.c:16
#define SUSPEND_CONSOLE	(MAX_NR_CONSOLES-1)
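The three-way decision made by pm_vt_switch can be condensed into a pure function, which makes the cases easier to see. This is a sketch only: the array of booleans is a made-up replacement for the kernel's linked list of registered drivers, and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Decide whether a VT switch is needed, given each driver's stated requirement. */
static bool sk_vt_switch_needed(const bool *required, size_t n_drivers,
				bool console_suspend_enabled)
{
	assert(required != NULL || n_drivers == 0);

	/* Case 1: nobody registered a preference, keep the old behaviour. */
	if (n_drivers == 0)
		return true;

	/* Case 2: console suspend is off, so debug messages must stay visible. */
	if (!console_suspend_enabled)
		return true;

	/* Case 3: at least one driver asked for the switch. */
	for (size_t i = 0; i < n_drivers; i++) {
		if (required[i])
			return true;
	}
	return false;
}
```

The only way to skip the switch is for at least one driver to have registered and for every registered driver to have opted out.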
Interestingly, these are separate functions because you can use TIOCL_SETKMSGREDIRECT (an ioctl[11]) to send kernel messages to a specific virtual terminal, but by default it's the same as the currently active console. The locations of the previously active console and the previous kernel messages location are stored in orig_fgconsole and orig_kmsg, to restore the state of the console and kernel messages after the machine wakes up again. Interestingly, this means orig_fgconsole also ends up storing any errors, so it has to be checked to ensure it's not less than zero before we try to do anything with the kernel messages on both suspend and resume.
drivers/tty/vt/vt_ioctl.c:1268
/* Perform a kernel triggered VT switch for suspend/resume */
static int disable_vt_switch;
int vt_move_to_console(unsigned int vt, int alloc)
{
	int prev;
	console_lock();
	/* Graphics mode - up to X */
	if (disable_vt_switch) {
		console_unlock();
		return 0;
	}
	prev = fg_console;
	if (alloc && vc_allocate(vt)) {
		/* we can't have a free VC for now. Too bad,
		 * we don't want to mess the screen for now. */
		console_unlock();
		return -ENOSPC;
	}
	if (set_console(vt)) {
		/*
		 * We're unable to switch to the SUSPEND_CONSOLE.
		 * Let the calling function know so it can decide
		 * what to do.
		 */
		console_unlock();
		return -EIO;
	}
	console_unlock();
	if (vt_waitactive(vt + 1)) {
		pr_debug("Suspend: Can't switch VCs.");
		return -EINTR;
	}
	return prev;
}
Unlike most other locking functions we've seen so far, console_lock needs to be careful to ensure nothing else is panicking and needs to dump to the console before grabbing the semaphore for the console and setting a couple of flags.

Panics

Panics are tracked via an atomic integer set to the id of the processor currently panicking.
kernel/printk/printk.c:2649
/**
 * console_lock - block the console subsystem from printing
 *
 * Acquires a lock which guarantees that no consoles will
 * be in or enter their write() callback.
 *
 * Can sleep, returns nothing.
 */
void console_lock(void)
{
	might_sleep();
	/* On panic, the console_lock must be left to the panic cpu. */
	while (other_cpu_in_panic())
		msleep(1000);
	down_console_sem();
	console_locked = 1;
	console_may_schedule = 1;
}
EXPORT_SYMBOL(console_lock);
kernel/printk/printk.c:362
/*
 * Return true if a panic is in progress on a remote CPU.
 *
 * On true, the local CPU should immediately release any printing resources
 * that may be needed by the panic CPU.
 */
bool other_cpu_in_panic(void)
{
	return (panic_in_progress() && !this_cpu_in_panic());
}
kernel/printk/printk.c:345
static bool panic_in_progress(void)
{
	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
}
kernel/printk/printk.c:350
/* Return true if a panic is in progress on the current CPU. */
bool this_cpu_in_panic(void)
{
	/*
	 * We can use raw_smp_processor_id() here because it is impossible for
	 * the task to be migrated to the panic_cpu, or away from it. If
	 * panic_cpu has already been set, and we're not currently executing on
	 * that CPU, then we never will be.
	 */
	return unlikely(atomic_read(&panic_cpu) == raw_smp_processor_id());
}
console_locked is a debug value, used to indicate that the lock should be held, and our first indication that this whole virtual terminal system is more complex than might initially be expected. kernel/printk/printk.c:373
/*
 * This is used for debugging the mess that is the VT code by
 * keeping track if we have the console semaphore held. It's
 * definitely not the perfect debug tool (we don't know if _WE_
 * hold it and are racing, but it helps tracking those weird code
 * paths in the console code where we end up in places I want
 * locked without the console semaphore held).
 */
static int console_locked;
console_may_schedule is used to see if we are permitted to sleep and schedule other work while we hold this lock. As we'll see later, the virtual terminal subsystem is not re-entrant, so there are all sorts of hacks in here to ensure we don't leave important code sections that can't be safely resumed.
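
To make the panic check above a little more concrete, here is a minimal userspace sketch using C11 atomics in place of the kernel's atomic_t. The predicate names mirror the kernel functions, but the explicit cpu parameter and the claim_panic helper are assumptions made for illustration: the kernel reads the current CPU id itself, and the code that sets panic_cpu is not shown in this article.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PANIC_CPU_INVALID (-1)

/* Hypothetical stand-in for the kernel's panic_cpu variable. */
static atomic_int panic_cpu = PANIC_CPU_INVALID;

/* Assumed helper: try to become the panicking CPU; only the first caller wins. */
static bool claim_panic(int cpu)
{
	int expected = PANIC_CPU_INVALID;

	assert(cpu >= 0);
	return atomic_compare_exchange_strong(&panic_cpu, &expected, cpu);
}

static bool panic_in_progress(void)
{
	return atomic_load(&panic_cpu) != PANIC_CPU_INVALID;
}

/* The kernel looks up the current CPU itself; here it is passed in. */
static bool this_cpu_in_panic(int cpu)
{
	return atomic_load(&panic_cpu) == cpu;
}

static bool other_cpu_in_panic(int cpu)
{
	return panic_in_progress() && !this_cpu_in_panic(cpu);
}
```

With this shape, a caller on any CPU other than the one that claimed the panic sees other_cpu_in_panic return true, which is the condition console_lock sleeps on.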

Disable VT Switch
As the comment below lays out, when another program is handling graphical display anyway, there's no need to do any of this, so the kernel provides a switch to turn the whole thing off. Interestingly, this appears to be used by only three drivers, so the specific hardware support required must not be particularly common.
drivers/gpu/drm/omapdrm/dss
drivers/video/fbdev/geode
drivers/video/fbdev/omap2
drivers/tty/vt/vt_ioctl.c:1308
/*
 * Normally during a suspend, we allocate a new console and switch to it.
 * When we resume, we switch back to the original console.  This switch
 * can be slow, so on systems where the framebuffer can handle restoration
 * of video registers anyways, there's little point in doing the console
 * switch.  This function allows you to disable it by passing it '0'.
 */
void pm_set_vt_switch(int do_switch)
{
	console_lock();
	disable_vt_switch = !do_switch;
	console_unlock();
}
EXPORT_SYMBOL(pm_set_vt_switch);
The rest of the vt_switch_console function is pretty normal, however, simply allocating space if needed to create the requested virtual terminal and then setting the current virtual terminal via set_console.

Virtual Terminal Set Console
With set_console, we begin (as if we haven't been already) to enter the madness that is the virtual terminal subsystem. As mentioned previously, modifications to its state must be made very carefully, as other work happening at the same time could create a complete mess. All this to say, calling set_console does not actually perform any work to change the state of the current console. Instead it indicates what changes it wants and then schedules that work.
drivers/tty/vt/vt.c:3153
int set_console(int nr)
{
	struct vc_data *vc = vc_cons[fg_console].d;
	if (!vc_cons_allocated(nr) || vt_dont_switch ||
		(vc->vt_mode.mode == VT_AUTO && vc->vc_mode == KD_GRAPHICS)) {
		/*
		 * Console switch will fail in console_callback() or
		 * change_console() so there is no point scheduling
		 * the callback
		 *
		 * Existing set_console() users don't check the return
		 * value so this shouldn't break anything
		 */
		return -EINVAL;
	}
	want_console = nr;
	schedule_console_callback();
	return 0;
}
The check for vc->vc_mode == KD_GRAPHICS is where most end-user graphical desktops will bail out of this change, as they're in graphics mode and don't need to switch away to the suspend console. vt_dont_switch is a flag used by the ioctls[11] VT_LOCKSWITCH and VT_UNLOCKSWITCH to prevent the system from switching virtual terminal devices when the user has explicitly locked it. VT_AUTO is a flag indicating that automatic virtual terminal switching is enabled[12], and thus deliberate switching to a suspend terminal is not required. However, if you do run your machine from a virtual terminal, then we indicate to the system that we want to change to the requested virtual terminal via the want_console variable and schedule a callback via schedule_console_callback.
drivers/tty/vt/vt.c:315
void schedule_console_callback(void)
{
	schedule_work(&console_work);
}
console_work is a work item queued on a workqueue[2], which will execute the given task asynchronously.

Console Callback
drivers/tty/vt/vt.c:3109
/*
 * This is the console switching callback.
 *
 * Doing console switching in a process context allows
 * us to do the switches asynchronously (needed when we want
 * to switch due to a keyboard interrupt).  Synchronization
 * with other console code and prevention of re-entrancy is
 * ensured with console_lock.
 */
static void console_callback(struct work_struct *ignored)
{
	console_lock();
	if (want_console >= 0) {
		if (want_console != fg_console &&
		    vc_cons_allocated(want_console)) {
			hide_cursor(vc_cons[fg_console].d);
			change_console(vc_cons[want_console].d);
			/* we only changed when the console had already
			   been allocated - a new console is not created
			   in an interrupt routine */
		}
		want_console = -1;
	}
...
console_callback first looks to see if there is a console change wanted via want_console and then changes to it if it's not the current console and has been allocated already. Before switching, we remove any cursor state with hide_cursor.
drivers/tty/vt/vt.c:841
static void hide_cursor(struct vc_data *vc)
{
	if (vc_is_sel(vc))
		clear_selection();
	vc->vc_sw->con_cursor(vc, false);
	hide_softcursor(vc);
}
A full dive into the tty driver is a task for another time, but this should give a general sense of how this system interacts with hibernation.
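
The request-then-defer pattern used by set_console and console_callback can be sketched in plain userspace C. This is only an illustration under simplifying assumptions: the names request_console, run_console_callback, allocated, and callback_scheduled are hypothetical, a boolean stands in for schedule_work, and the console lock and the vt_dont_switch and KD_GRAPHICS checks are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CONSOLES 8            /* hypothetical limit for this sketch */

static bool allocated[MAX_CONSOLES] = { true, true, false };
static int fg_console;            /* console currently in the foreground */
static int want_console = -1;     /* pending request, -1 means none */
static bool callback_scheduled;   /* stands in for schedule_work() */

/* Record the request and defer the actual switch. */
static int request_console(int nr)
{
	if (nr < 0 || nr >= MAX_CONSOLES || !allocated[nr])
		return -1;        /* the kernel returns -EINVAL here */
	want_console = nr;
	callback_scheduled = true;
	return 0;
}

/* Runs later, from the "worker"; only here does fg_console change. */
static void run_console_callback(void)
{
	if (!callback_scheduled)
		return;
	callback_scheduled = false;
	if (want_console >= 0) {
		if (want_console != fg_console && allocated[want_console])
			fg_console = want_console;
		want_console = -1;
	}
	/* Whether or not a switch happened, the request has been consumed. */
	assert(want_console == -1);
}
```

The point of the split is that the caller of request_console never touches fg_console directly; the state only changes when the deferred callback runs, which in the kernel happens in process context with the console lock held.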

Notify Power Management Call Chain
kernel/power/hibernate.c:767
pm_notifier_call_chain_robust(PM_HIBERNATION_PREPARE, PM_POST_HIBERNATION)
This calls a chain of power management callbacks, first passing PM_HIBERNATION_PREPARE. PM_POST_HIBERNATION is passed later, either on startup or, if one of the callbacks returns an error, to the callbacks that have already run.
kernel/power/main.c:98
int pm_notifier_call_chain_robust(unsigned long val_up, unsigned long val_down)
{
	int ret;
	ret = blocking_notifier_call_chain_robust(&pm_chain_head, val_up, val_down, NULL);
	return notifier_to_errno(ret);
}
The power management notifier is a blocking notifier chain, which means it has the following properties.
include/linux/notifier.h:23
 *	Blocking notifier chains: Chain callbacks run in process context.
 *		Callouts are allowed to block.
The callback chain is a linked list with each entry containing a priority and a function to call. The function technically takes in a data value, but it is always NULL for the power management chain.
include/linux/notifier.h:49
struct notifier_block;
typedef	int (*notifier_fn_t)(struct notifier_block *nb,
			unsigned long action, void *data);
struct notifier_block {
	notifier_fn_t notifier_call;
	struct notifier_block __rcu *next;
	int priority;
};
The head of the linked list is protected by a read-write semaphore.
include/linux/notifier.h:65
struct blocking_notifier_head {
	struct rw_semaphore rwsem;
	struct notifier_block __rcu *head;
};
Because the list is prioritized, adding an item requires walking the list until an item with lower[13] priority is found, then inserting the new item before it.
kernel/notifier.c:252
/*
 *	Blocking notifier chain routines.  All access to the chain is
 *	synchronized by an rwsem.
 */
static int __blocking_notifier_chain_register(struct blocking_notifier_head *nh,
					      struct notifier_block *n,
					      bool unique_priority)
{
	int ret;
	/*
	 * This code gets used during boot-up, when task switching is
	 * not yet working and interrupts must remain disabled.  At
	 * such times we must not call down_write().
	 */
	if (unlikely(system_state == SYSTEM_BOOTING))
		return notifier_chain_register(&nh->head, n, unique_priority);
	down_write(&nh->rwsem);
	ret = notifier_chain_register(&nh->head, n, unique_priority);
	up_write(&nh->rwsem);
	return ret;
}
kernel/notifier.c:20
/*
 *	Notifier chain core routines.  The exported routines below
 *	are layered on top of these, with appropriate locking added.
 */
static int notifier_chain_register(struct notifier_block **nl,
				   struct notifier_block *n,
				   bool unique_priority)
{
	while ((*nl) != NULL) {
		if (unlikely((*nl) == n)) {
			WARN(1, "notifier callback %ps already registered",
			     n->notifier_call);
			return -EEXIST;
		}
		if (n->priority > (*nl)->priority)
			break;
		if (n->priority == (*nl)->priority && unique_priority)
			return -EBUSY;
		nl = &((*nl)->next);
	}
	n->next = *nl;
	rcu_assign_pointer(*nl, n);
	trace_notifier_register((void *)n->notifier_call);
	return 0;
}
Each callback can return one of a series of options.
include/linux/notifier.h:18
#define NOTIFY_DONE		0x0000		/* Don't care */
#define NOTIFY_OK		0x0001		/* Suits me */
#define NOTIFY_STOP_MASK	0x8000		/* Don't call further */
#define NOTIFY_BAD		(NOTIFY_STOP_MASK|0x0002)
						/* Bad/Veto action */
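
The defines above pack two things into one integer: a stop bit and a small status code. As a rough userspace sketch of how such values can be tested and converted to an errno-style result, consider the following. notifier_to_errno is the helper called by pm_notifier_call_chain_robust earlier, but its body here, along with notifier_from_errno, is reconstructed rather than quoted from the kernel headers, so treat both as an assumption. notify_stops is a hypothetical name.

```c
#include <assert.h>

#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

/* Hypothetical helper: should the chain walk stop after this callback? */
static int notify_stops(int ret)
{
	return (ret & NOTIFY_STOP_MASK) != 0;
}

/* Pack a negative errno into a return value that also stops the chain. */
static int notifier_from_errno(int err)
{
	assert(err <= 0);
	if (err)
		return NOTIFY_STOP_MASK | (NOTIFY_OK - err);
	return NOTIFY_OK;
}

/* Recover the negative errno (or 0) from a callback's return value. */
static int notifier_to_errno(int ret)
{
	ret &= ~NOTIFY_STOP_MASK;
	return ret > NOTIFY_OK ? NOTIFY_OK - ret : 0;
}
```

Under this encoding NOTIFY_BAD maps to -1, while NOTIFY_DONE and NOTIFY_OK both map to 0, so only a stopping return value surfaces as an error to the caller.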
When notifying the chain, if a function returns NOTIFY_STOP or NOTIFY_BAD (that is, any value with NOTIFY_STOP_MASK set), then the previous parts of the chain are called again with PM_POST_HIBERNATION[14] and an error is returned.
kernel/notifier.c:107
/**
 * notifier_call_chain_robust - Inform the registered notifiers about an event
 *                              and rollback on error.
 * @nl:		Pointer to head of the blocking notifier chain
 * @val_up:	Value passed unmodified to the notifier function
 * @val_down:	Value passed unmodified to the notifier function when recovering
 *              from an error on @val_up
 * @v:		Pointer passed unmodified to the notifier function
 *
 * NOTE:	It is important the @nl chain doesn't change between the two
 *		invocations of notifier_call_chain() such that we visit the
 *		exact same notifier callbacks; this rules out any RCU usage.
 *
 * Return:	the return value of the @val_up call.
 */
static int notifier_call_chain_robust(struct notifier_block **nl,
				     unsigned long val_up, unsigned long val_down,
				     void *v)
{
	int ret, nr = 0;
	ret = notifier_call_chain(nl, val_up, v, -1, &nr);
	if (ret & NOTIFY_STOP_MASK)
		notifier_call_chain(nl, val_down, v, nr-1, NULL);
	return ret;
}
Each of these callbacks tends to be quite driver-specific, so we'll cease discussion of this here.
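
To see the priority ordering and the rollback working together, here is a small self-contained userspace sketch. It is not the kernel implementation: there is no locking or RCU, the duplicate and unique_priority checks are dropped, and the names struct notifier, chain_register, chain_call, chain_call_robust, the cb_* callbacks, and demo are all hypothetical. The demo uses action 1 as the "up" value and action 2 as the "down" value.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

struct notifier {
	int (*call)(struct notifier *nb, unsigned long action);
	struct notifier *next;
	int priority;
};

/* Insert n before the first entry with a strictly lower priority. */
static void chain_register(struct notifier **nl, struct notifier *n)
{
	assert(n != NULL);
	while (*nl != NULL && n->priority <= (*nl)->priority)
		nl = &(*nl)->next;
	n->next = *nl;
	*nl = n;
}

/* Call up to nr_to_call entries (-1 means all), counting them in *nr_calls. */
static int chain_call(struct notifier **nl, unsigned long action,
		      int nr_to_call, int *nr_calls)
{
	int ret = 0;	/* NOTIFY_DONE */
	struct notifier *nb = *nl;

	while (nb != NULL && nr_to_call != 0) {
		struct notifier *next = nb->next;

		ret = nb->call(nb, action);
		if (nr_calls != NULL)
			(*nr_calls)++;
		if (ret & NOTIFY_STOP_MASK)
			break;
		nb = next;
		nr_to_call--;
	}
	return ret;
}

/* On a veto, replay val_down to the entries that ran before the vetoing one. */
static int chain_call_robust(struct notifier **nl, unsigned long val_up,
			     unsigned long val_down)
{
	int nr = 0;
	int ret = chain_call(nl, val_up, -1, &nr);

	if (ret & NOTIFY_STOP_MASK)
		chain_call(nl, val_down, nr - 1, NULL);
	return ret;
}

/* Demo callbacks: each appends one character to trace; cb_b vetoes action 1. */
static char trace[16];
static int trace_len;

static int log_call(char tag, int ret)
{
	trace[trace_len++] = tag;
	return ret;
}

static int cb_a(struct notifier *nb, unsigned long action)
{
	(void)nb;
	return log_call(action == 1 ? 'a' : 'A', NOTIFY_OK);
}

static int cb_b(struct notifier *nb, unsigned long action)
{
	(void)nb;
	return log_call('b', action == 1 ? NOTIFY_BAD : NOTIFY_OK);
}

static int cb_c(struct notifier *nb, unsigned long action)
{
	(void)nb;
	(void)action;
	return log_call('c', NOTIFY_OK);
}

/* Register out of priority order, then run the robust call. */
static int demo(void)
{
	static struct notifier a = { cb_a, NULL, 10 };
	static struct notifier b = { cb_b, NULL, 5 };
	static struct notifier c = { cb_c, NULL, 0 };
	struct notifier *head = NULL;

	chain_register(&head, &c);
	chain_register(&head, &a);
	chain_register(&head, &b);
	return chain_call_robust(&head, 1, 2);
}
```

In the demo the chain ends up ordered a, b, c regardless of registration order. When b vetoes, only a is called again with the "down" action, and c never runs at all.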

Sync Filesystems
The next step is to ensure all filesystems have been synchronized to disk. This is performed via a simple helper function that times how long the full synchronize operation, ksys_sync, takes.
kernel/power/main.c:69
void ksys_sync_helper(void)
{
	ktime_t start;
	long elapsed_msecs;
	start = ktime_get();
	ksys_sync();
	elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
	pr_info("Filesystems sync: %ld.%03ld seconds\n",
		elapsed_msecs / MSEC_PER_SEC, elapsed_msecs % MSEC_PER_SEC);
}
EXPORT_SYMBOL_GPL(ksys_sync_helper);
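
The arithmetic in that pr_info call is easy to check in userspace. The sketch below reproduces the seconds-and-milliseconds formatting with snprintf and computes a millisecond difference from two struct timespec values in place of ktime_t. The names format_elapsed and timespec_diff_msecs are hypothetical, and nothing here actually calls sync.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define MSEC_PER_SEC 1000L

/* Milliseconds between two timestamps, truncating sub-millisecond remainders. */
static long timespec_diff_msecs(struct timespec start, struct timespec end)
{
	return (long)(end.tv_sec - start.tv_sec) * MSEC_PER_SEC +
	       (end.tv_nsec - start.tv_nsec) / 1000000L;
}

/* Format a millisecond count as seconds with three decimals. */
static int format_elapsed(char *buf, size_t len, long elapsed_msecs)
{
	assert(elapsed_msecs >= 0);
	return snprintf(buf, len, "%ld.%03ld",
			elapsed_msecs / MSEC_PER_SEC,
			elapsed_msecs % MSEC_PER_SEC);
}
```

The %03ld conversion is what keeps a 5 millisecond sync from printing as "0.5": the remainder is zero-padded to three digits.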
ksys_sync wakes and instructs a set of flusher threads to write out every filesystem, first their inodes[15], then the full filesystem, and then finally all block devices, to ensure all pages are written out to disk.
fs/sync.c:87
/*
 * Sync everything. We start by waking flusher threads so that most of
 * writeback runs on all devices in parallel. Then we sync all inodes reliably
 * which effectively also waits for all flusher threads to finish doing
 * writeback. At this point all data is on disk so metadata should be stable
 * and we tell filesystems to sync their metadata via ->sync_fs() calls.
 * Finally, we writeout all block devices because some filesystems (e.g. ext2)
 * just write metadata (such as inodes or bitmaps) to block device page cache
 * and do not sync it on their own in ->sync_fs().
 */
void ksys_sync(void)
{
	int nowait = 0, wait = 1;
	wakeup_flusher_threads(WB_REASON_SYNC);
	iterate_supers(sync_inodes_one_sb, NULL);
	iterate_supers(sync_fs_one_sb, &nowait);
	iterate_supers(sync_fs_one_sb, &wait);
	sync_bdevs(false);
	sync_bdevs(true);
	if (unlikely(laptop_mode))
		laptop_sync_completion();
}
It follows an interesting pattern of using iterate_supers to run both sync_inodes_one_sb and then sync_fs_one_sb on each known filesystem[16]. It also calls both sync_fs_one_sb and sync_bdevs twice, first without waiting for any operations to complete and then again waiting for completion[17]. When laptop_mode is enabled the system runs additional filesystem synchronization operations after the specified delay without any writes.
mm/page-writeback.c:111
/*
 * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
 * a full sync is triggered after this time elapses without any disk activity.
 */
int laptop_mode;
EXPORT_SYMBOL(laptop_mode);
However, when running a filesystem synchronization operation, the system will add an additional timer to schedule more writes after the laptop_mode delay. We don't want the state of the system to change at all while performing hibernation, so we cancel those timers.
mm/page-writeback.c:2198
/*
 * We're in laptop mode and we've just synced. The sync's writes will have
 * caused another writeback to be scheduled by laptop_io_completion.
 * Nothing needs to be written back anymore, so we unschedule the writeback.
 */
void laptop_sync_completion(void)
{
	struct backing_dev_info *bdi;
	rcu_read_lock();
	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
		del_timer(&bdi->laptop_mode_wb_timer);
	rcu_read_unlock();
}
As a side note, the ksys_sync function is simply called when the system call sync is used.
fs/sync.c:111
SYSCALL_DEFINE0(sync)
{
	ksys_sync();
	return 0;
}

The End of Preparation
With that, the system has finished preparations for hibernation. This is a somewhat arbitrary cutoff, but next the system will begin a full freeze of userspace to then dump memory out to an image and finally to perform hibernation. All this will be covered in future articles!
  1. Hibernation modes are outside of scope for this article, see the previous article for a high-level description of the different types of hibernation.
  2. Workqueues are a mechanism for running asynchronous tasks. A full description of them is a task for another time, but the kernel documentation on them is available here: https://www.kernel.org/doc/html/v6.9/core-api/workqueue.html
  3. This is a bit of an oversimplification, but since this isn't the main focus of this article this description has been kept to a higher level.
  4. Kconfig is Linux's build configuration system that sets many different macros to enable/disable various features.
  5. Kconfig defaults to the first default found.
  6. Including checking whether the algorithm is larval? Which appears to indicate that it requires additional setup, but is an interesting choice of name for such a state.
  7. Specifically when we get to process freezing, which we'll get to in the next article in this series.
  8. Swap space is outside the scope of this article, but in short it is a buffer on disk that the kernel uses to store memory not currently in use to free up space for other things. See Swap Management for more details.
  9. The code for this is lengthy and tangential, thus it has not been included here. If you're curious about the details of this, see kernel/power/hibernate.c:858 for the details of hibernate_quiet_exec, and drivers/nvdimm/core.c:451 for how it is used in nvdimm.
  10. Annoyingly this code appears to use the terms console and virtual terminal interchangeably.
  11. ioctls are special device-specific I/O operations that permit performing actions outside of the standard file interactions of read/write/seek/etc.
  12. I'm not entirely clear on how this flag works; this subsystem is particularly complex.
  13. In this case a higher number is higher priority.
  14. Or whatever the caller passes as val_down, but in this case we're specifically looking at how this is used in hibernation.
  15. An inode refers to a particular file or directory within the filesystem. See Wikipedia for more details.
  16. Each active filesystem is registered with the kernel through a structure known as a superblock, which contains references to all the inodes contained within the filesystem, as well as function pointers to perform the various required operations, like sync.
  17. I'm including minimal code in this section, as I'm not looking to deep dive into the filesystem code at this time.

4 September 2024

Reproducible Builds: Reproducible Builds in August 2024

Welcome to the August 2024 report from the Reproducible Builds project! Our reports attempt to outline what we've been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. LWN: The history, status, and plans for reproducible builds
  2. Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs
  3. Distribution news
  4. Mailing list news
  5. diffoscope
  6. Website updates
  7. Upstream patches
  8. Reproducibility testing framework

LWN: The history, status, and plans for reproducible builds
The free software newspaper of record, Linux Weekly News, published an in-depth article based on Holger Levsen's talk, "Reproducible Builds: The First Eleven Years", which was presented at the recent DebConf24 conference in Busan, South Korea. Titled "The history, status, and plans for reproducible builds" and written by Jake Edge, LWN's article not only summarises Holger's talk and clarifies its message but links to external information as well. Holger's original talk can also be watched on the DebConf24 webpage (a direct .webm link and his HTML slides are also available). There are also a significant number of comments on LWN's page. Holger Levsen also headed a scheduled discussion session at DebConf24 on "Preserving *other* build artifacts", addressing a topic where a number of Debian packages produce (or would like to produce) results that are neither the .deb files, the build logs nor the logs of CI tests. This is an issue for reproducible builds as this fourth type of build artifact is typically shipped within the binary .deb packages and is invariably non-deterministic, thus making the .deb files unreproducible. (A direct .webm link and HTML slides are available).

Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs
Peter Eisentraut wrote a detailed blog post on the subject of "The new PostgreSQL 17 make dist". Like many projects, the PostgreSQL database has previously pre-built parts of its GNU Autotools build system: "the reason for this is a mix of convenience and traditional practice". Peter astutely notes that this arrangement in the build system is quite tricky as:
You need to carefully maintain the different states of "clean source code", "partially built source code", and "fully built source code", and the commands to transition between them.
However, Peter goes on to mention that:
a lot more attention is nowadays paid to the software supply chain. There are security and legal reasons for this. When users install software, they want to know where it came from, and they want to be sure that they got the right thing, not some fake version or some version of dubious legal provenance.
He also cites the XZ Utils backdoor as a reason to care about transparent and reproducible ways of distributing and communicating a source tarball and provenance. Because of this, intermediate build artifacts are now essentially disallowed from PostgreSQL distribution tarballs.

Distribution news
In Debian this month, 30 reviews of Debian packages were added, 17 were updated and 10 were removed, adding to our knowledge about identified issues. One issue type was added by Chris Lamb, too. In addition, an issue was filed to update the Salsa CI pipeline (used by 1,000s of Debian packages) to no longer test for reproducibility with reprotest's build_path variation. Holger Levsen provided a rationale for this change in the issue, which has already been made to the tests being performed by tests.reproducible-builds.org.
In Arch Linux this month, Jelle van der Waa published a short blog post on the topic of Investigating creating reproducible images with mkosi, motivated by the desire to make it possible for anyone to re-recreate the official Arch cloud image bit-by-bit identical on their own machine as per [the] reproducible builds definition. In addition, Jelle filed a patch for pacman, the Arch Linux package manager, to respect the SOURCE_DATE_EPOCH environment variable when installing a package.
In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.
In Android news, the IzzyOnDroid project added 49 new rebuilder recipes and now features 256 total reproducible applications representing 21% of the total offerings in the repository. IzzyOnDroid is an F-Droid style repository for Android apps[:] applications in this repository are official binaries built by the original application developers, taken from their resp. repositories (mostly GitHub).

Mailing list news From our mailing list this month:
  • Bernhard M. Wiedemann posted a brief message to the list with some helpful information regarding nondeterminism within Rust binaries, positing the use of the codegen-units = 16 default and resulting in a bug being filed in the Rust issue tracker.
  • Bernhard also wrote to the list, following up to a thread in November 2023, on attempts to make the LibreOffice suite of office applications build reproducibly. In the thread from this month, Bernhard could announce that the four patches previously mentioned have landed in LibreOffice upstream.
  • Fay Stegerman linked the mailing list to a thread she made on the Signal issue tracker regarding whether "device-specific binaries [can] ever be considered meaningfully reproducible". In particular: "the whole part about allow[ing] multiple third parties to come to a consensus on a correct result breaks down completely when correct is device-specific and not something everyone can agree on."
  • Developer kpcyrd posted an update for the source code indexing project whatsrc.org, announcing that it is now importing packages from live-bootstrap ("a usable Linux system [that is] created with only human-auditable, and wherever possible, human-written, source code") into its database of provenance data.
  • Lastly, Mechtilde Stehmann posted an update to an earlier thread about how Java builds are not reproducible on the armhf architecture, enquiring how they might gain temporary access to such a machine in order to perform some deeper testing.

diffoscope
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb released versions 274, 275, 276 and 277, uploaded these to Debian, and made the following changes as well:
  • New features:
    • Strip ANSI escapes (usually colour codes) from the output of the Procyon Java decompiler.
    • Factor out a method for stripping ANSI escapes.
    • Append output from dumppdf(1) in more cases, avoiding situations where we fall back to a binary diff.
    • Add support for version 2.212 of Perl's IO::Compress::Zip.
  • Bug fixes:
    • Also catch RuntimeError exceptions when importing the PyPDF library so that it, or, crucially, its transitive dependencies, cannot cause diffoscope to traceback at runtime and build time.
    • Do not call marshal.load( ) of precompiled Python bytecode as it is, alas, inherently unsafe. Replace for now with a brief summary of the code section of .pyc.
    • Don't include excessive debug output when calling dumppdf(1).
  • Testsuite-related changes:
    • Don't bother to check version number in test_python.py: the fixture for this test is fixed.
    • Update test_zip text fixtures and definitions to support new changes to the Perl IO::Compress library.
In addition, Mattia Rizzolo updated the available architectures for a number of test dependencies, Sergei Trofimovich fixed an issue to avoid diffoscope crashing when hashing directory symlinks, and Vagrant Cascadian proposed GNU Guix updates for diffoscope versions 275, 276 and 277.

Website updates
There were a rather substantial number of improvements made to our website this month, including:
  • Alba Herrerias:
    • Substantially extend the guidance on the Contribute page.
  • Chris Lamb:
    • Set the future: true configuration value so we render all files and documents in the website, regardless of whether they have a date property in the future. After all, we don't re-generate the website on a timer, and have other ways of making unpublished, draft posts.
  • Fay Stegerman:
  • hulkoba:
  • kpcyrd:
  • Mattia Rizzolo:
  • Pol Dellaiera:

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen, including:
  • Temporarily install the openssl-provider-legacy package for the Debian unstable environments for running diffoscope due to Debian bug #1078944.
  • Mark Debian armhf architecture nodes as being down due to proxy down.
  • Detect proxy failures.
  • Run the index-buildinfo for the builtin-pho script with the -q switch.
  • Disable all Arch Linux reproducible jobs.
In addition, Mattia Rizzolo updated the website configuration to install the ruby-jekyll-sitemap package as it is now used in the website, Roland Clobus updated the script to build Debian live images to treat openQA issues as warnings, and Vagrant Cascadian marked the cbxi4b node as down.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Jonathan Dowland: loading (unintended consequences?)

For their 30th anniversary (ish; the Covid pandemic pushed the date out a bit) British electronic music duo Orbital released the compilation 30 something. The track list mostly looks like a best hits list, which, given that their prior compilation celebrating 20 years looks much the same, would appear superfluous. However, they've rearranged and re-recorded all their songs for 30, to reflect their live arrangements. The reworkings are sufficiently distinct from the original versions (in some cases I prefer them) and elevate the release. The couple of new tracks are also fun, and many of the remixes on the second disc are worth a listen too.
cover art from Orbital - 30 Something
But what I actually sat down to write about was the cover artwork. They often have designs which riff on the notion of a circle (given their name) and the 30-something art (both for the album and single takes from it) adapts a loading spinner-like device from computing (I suppose it most closely resembles the spinner from macOS). A possibly unintended effect of the pattern occurs when you view it on a display which is adjusting its brightness, such as if you're listening to it on a phone, the screen is off, and you pick it up. The brightest part of the spinner is visible first, and the rest fade into visibility in sequence. The first time you see this is unexpected and very cool. (I've tried to recreate it in the picture below, but I don't think it's worked.) Although I've suffixed the title of this post "unintended consequences?", it's quite possible this was deliberate.
screenshot of the artwork displayed on my phone
I've got the pattern on a t-shirt and my kids love to call out "Daddy's loading!" In my convalescence it's taken on a special sort of resonance because at times I've felt I'm in a holding state: waiting for an appointment to be made; waiting a polite interval before chasing an appointment; waiting for treatment to start after attending an appointment. Thankfully I'm at the end of that now, I hope.

31 August 2024

Russ Allbery: Review: The Shepherd's Crown

Review: The Shepherd's Crown, by Terry Pratchett
Series: Discworld #41
Publisher: Harper
Copyright: 2015
Printing: 2016
ISBN: 0-06-242998-1
Format: Trade paperback
Pages: 276
The Shepherd's Crown is the 41st and final Discworld novel and the 5th and final Tiffany Aching novel. You should not start here. There is a pretty major character event in the second chapter of this book. I'm not going to say directly what it is, but you will likely be able to guess from the rest of the review. If you're particularly averse to spoilers, you may want to skip reading this until you've read the book. Tiffany Aching is extremely busy. Witches are responsible for all the little tasks that fall between the cracks, and there are a lot of cracks. The better she gets at her job, the more of the job there seems to be.
"Well," said Tiffany, "there's too much to be done and not enough people to do it." The smile that the kelda gave her was a strange one. The little woman said, "Do ye let them try? Ye mustn't be afraid to ask for help. Pride is a good thing, my girl, but it will kill you in time."
And that's before an earth-shattering change in the world of witches, one that leaves Tiffany shuttling between Lancre and the Chalk trying to be too many things to too many people. Plus the kelda is worried some deeper trouble is brewing. And then Tiffany gets an exiled elven queen who has never understood the worth of other people dumped on her, and has to figure out what to do with her. The starting idea is great. I continue to be impressed with how well Pratchett handles Tiffany's coming-of-age story. Finding one's place in the world isn't one lesson or event; it's layers of them, with each new growth in responsibility uncovering new things to learn that are often quite different from the previous problems. Tiffany has worked through child problems, adolescent problems, and new adulthood problems. Now she's on a course towards burnout, which is exactly the kind of problem Tiffany would have given her personality. Even better, the writing at the start of The Shepherd's Crown is tight and controlled and sounds like Pratchett, which was a relief after the mess of Raising Steam. The contrast is so sharp that I found myself wondering if parts of this book had been written earlier, or if Pratchett found a new writing or editing method. The characters all sound like themselves, and although some of the turns of phrase are not quite as sharp as in earlier books, they're at least at the level of Snuff. Unfortunately, it doesn't last. There are some great moments and some good quotes, but the writing starts to slip at about the two-thirds point, the sentences begin to meander, the characters start repeating the name of the person they're talking to, and the narration becomes increasingly strained. It felt like Pratchett knew the emotional tone he wanted to evoke but couldn't find a subtle way to express it, so the story and the characters start to bludgeon the reader with Grand Statements.
It's never as bad as Raising Steam, but it doesn't slip smoothly off the page to rewrite your brain the way that Pratchett could at his best. What makes this worse is that the plot is not very interesting. I wanted to read a book about Tiffany understanding burnout, asking for help, and possibly also about mental load and how difficult delegation is. There is some movement in that direction: she takes on some apprentices, although we don't see as much of her interactions with them as I'd like, and there's an intriguing new male character who wants to be a witch. I wish Pratchett had been able to give Geoffrey his own book. He and his goat were the best part of the story, but it felt rushed and I think he would have had more impact if the reader got to see him develop his skills over time the way that we did with Tiffany. But, alas, all of that is side story to the main plot, which is about elves. As you may know from previous reviews, I do not get along with Pratchett's conception of elves. I find them boring and too obviously evil, and have since Lords and Ladies. Villains have never been one of Pratchett's strengths, and I think his elves are my least favorite. One of the goals of this book is to try to make them less one-note by having Tiffany try to teach one of them empathy, but I didn't find any of the queen's story arc convincing. If Pratchett had pulled those threads together with something more subtle, emotional, and subversive, I think it could have worked, but instead we got another battle royale, and Lords and Ladies did that better.
"Granny never said as she was better than others. She just got on with it and showed 'em and people worked it out for themselves."
And so we come to the end. I wish I could say that the quality held up through the whole series, and it nearly did, but alas it fell apart a bit at the end. Raising Steam I would skip entirely. The Shepherd's Crown is not that bad, but it's minor Pratchett that's worth reading mainly because it's the send-off (and there are a lot of reasons within the story to think Pratchett knew that when writing it). There are a few great lines, some catharsis, and a pretty solid ending for Tiffany, but it's probably not a book that I'll re-read. Content warning: major character death. Special thanks to Emmet Asher-Perrin, whose Tor.com/Reactor re-read of all of Discworld got me to pick the series up again and finally commit to reading all of it. I'm very glad I did. Rating: 6 out of 10

22 August 2024

Jonathan McDowell: Thoughts on Advent of Code + Rust

Diego wrote about his dislike for Advent of Code and that reminded me I hadn't written up my experience from 2023. Mostly because, spoiler, I never actually completed it and always intended to do so and then write it up. I think it's time to accept I'm not going to do that, and write down some thoughts before I forget all of them. These are somewhat vague, given the time that's elapsed, but I think still relevant. You might also find Roger's problem write up interesting. I've tried AoC a couple of times before; I think I had a very brief attempt back in 2021, and I got 4 days in for 2022. For Advent of Code 2023 I tried much harder to actually complete the challenges, and got most of the way there. I didn't allow myself to move on to the next day until fully completing the previous day, and didn't end up doing the second half of December 24th, or any of December 25th.

Rust

First I want to talk about Rust, which is the language I chose to use for the problems. I've dabbled a little in it, but I'd like more familiarity with the basic language, and some programming problems seemed like a good way to get that. It's a language I want to like; I've spent a lot of my career writing C, do more in Go these days, and generally think Rust promises a low level, run-time light environment like C but with the rough edges taken off.

I set myself the challenge of using just bare Rust; no external crates, no use of cargo. I was accused of playing on hard mode by doing this, but it really wasn't the intention - I figured that I should be able to do what I needed without recourse to anything outside the core language, and didn't want what seemed like the extra complexity of dealing with cargo. That caused problems, however. I'm used to by-default generic error handling in Go through the error type, but Rust seems to have much more tightly typed errors. I was pointed at anyhow as the right way to do this in Rust. I still find this surprising; I ended up using unwrap() a lot when I think with more generic error handling I could have used ?.

The other thing I discovered is that by default rustc is heavy on the debug output. I got significantly better results on some of the solutions with rustc -O -C target-cpu=native source.rs. I probably shouldn't be surprised by this, but worth noting.

Rust, to me, has a syntax only a C++ programmer could love. I am not a C++ programmer. Coming from C I found Go to be a nice, simple syntax to learn. Rust has not been the same. There's a lot more punctuation, and it's not always clear to me what it's doing. This applies more when reading other people's code than when writing it myself, obviously, but I see a lot of Rust code that could give Perl a run for its money in terms of looking like line noise.

The borrow checker didn't bug me too much, but did add overhead to my thinking. The Rust compiler is generally very good at outputting helpful error messages when the programmer is an idiot. I ended up having to use a RefCell for one solution, and using .iter() for loops rather than explicit iterators (why, why is this different?). I also kept forgetting to explicitly mark variables as mutable when declaring them.

Things I liked? There's a rich set of first class data types. Look, I'm a C programmer, I'm easily pleased. You give me some sort of hash array and I'll be happy. Rust manages that, tuples, strings, all the standard bits any modern language can provide. The whole impl thing for adding methods to structures I like as a way of providing some abstraction, though I think Go has a nicer syntax for it. The compiler, as mentioned, is great at spitting out useful errors for the most part. Also although I wasn't using external crates for AoC I do appreciate there's a decent ecosystem there now (though that brings up another gripe: Rust seems to still be a fairly fast moving target, to the extent I can no longer rely on the compiler in Debian stable to be able to compile random projects I find).
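On the error handling point above: one standard-library-only option that fits the "no external crates" constraint is returning Box<dyn std::error::Error>, which lets ? convert differently typed errors into one boxed type. The sketch below is illustrative only and is not taken from the solutions discussed here; the function names sum_lines and sum_file are hypothetical.

```rust
use std::error::Error;
use std::fs;

/// Sums one integer per non-empty line of `text`.
/// Hypothetical helper: any parse failure is boxed into `Box<dyn Error>`.
fn sum_lines(text: &str) -> Result<i64, Box<dyn Error>> {
    let mut total = 0i64;
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // `?` converts the ParseIntError into Box<dyn Error> via `From`.
        total += line.parse::<i64>()?;
    }
    Ok(total)
}

/// Reads a file and sums its lines.
/// The io::Error from reading and the parse error from `sum_lines`
/// both flow through the same boxed error type, so no unwrap() is needed.
fn sum_file(path: &str) -> Result<i64, Box<dyn Error>> {
    let text = fs::read_to_string(path)?;
    sum_lines(&text)
}
```

A caller could then write something like `let total = sum_file("input.txt")?;` inside a main that itself returns Result<(), Box<dyn Error>>. The trade-off is that the concrete error type is erased, which is usually acceptable for puzzle code.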

Advent of Code

Let's talk about the Advent of Code bit now. Hopefully it's long enough since it came out that this won't be spoilers for anyone, but if you haven't attempted the 2023 AoC and might, you might want to stop reading here.

First, a refresher on the format for those who might not be aware of it. Problems are posted daily from December 1st until the 25th. Each is in 2 parts; the second part is not viewable until you have provided the correct answer for the first part. There's a whole leaderboard thing going on, but the puzzle opens at midnight UTC-5 so generally by the time I wake up and have time to look the problem has been solved many times over; no chance of getting listed.

Credit to AoC creator, Eric Wastl, for writing up the set of problems in an entertaining fashion. I quite enjoyed seeing how the puzzle would be phrased each day, and the whole thing obviously brings a lot of joy to folk I know.

I always start AoC thinking it'll be a fun set of puzzles to solve. Then something happens and I miss a day or two, and all of a sudden I've a bunch of catching up to do and it's all a bit more of a chore. I hit that at some points this time, but made a concerted effort to try and power through it.

That perseverance was required up front, because I found the second part of Day 1 to be ill specified, and had to iterate a few times to actually calculate the desired solution (IIRC, issues about whether sevenone at the end of a line ended up as 7 or 1 really tripped me up). I don't recall any other problems that bit me as hard on the specification as this one, but it happening up front was unfortunate.
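To make the Day 1 ambiguity concrete, here is a minimal sketch of one reading of the rule: scan every position in the line, treat either a numeral or a spelled-out digit starting at that position as a match, and combine the first and last matches. Under that assumption, a trailing sevenone ends in 1. This is an illustration rather than the code used at the time, and the names WORDS, digit_at and calibration_value are hypothetical.

```rust
/// Spelled-out digits; index + 1 gives the numeric value.
const WORDS: [&str; 9] = [
    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
];

/// Returns the digit that starts at the beginning of `rest`, if any,
/// accepting either an ASCII numeral or a spelled-out word.
fn digit_at(rest: &[u8]) -> Option<u32> {
    match rest.first() {
        Some(&b) if b.is_ascii_digit() => Some(u32::from(b - b'0')),
        _ => WORDS
            .iter()
            .position(|w| rest.starts_with(w.as_bytes()))
            .map(|p| p as u32 + 1),
    }
}

/// Combines the first and last digit found in `line` into a two-digit value.
/// Every byte offset is checked independently, so adjacent or overlapping
/// words (e.g. "eightwo") each count.
fn calibration_value(line: &str) -> Option<u32> {
    let bytes = line.as_bytes();
    let mut digits = (0..bytes.len()).filter_map(|i| digit_at(&bytes[i..]));
    let first = digits.next()?;
    // With only one match, the first digit is also the last.
    let last = digits.last().unwrap_or(first);
    Some(first * 10 + last)
}
```

Checking each offset independently avoids the trap of replacing words left to right, where consuming one word can destroy a later match.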
The short example input doesn't always help with this either; either it's not enough to be able to extrapolate patterns, or it doesn't show all the variations you need to account for (that aren't fully specified in the text), or in a few cases it turned out I needed to understand the shape of the actual data to produce a solution that could actually complete in a reasonable time.

Which brings me to another matter: sometimes brute force doesn't actually work. This is fine, but the second part of the day's problem can change the approach you'd take. So sometimes I got lucky in the way I handled the first half, and doing the second half was a simple 5 minute tweak, and sometimes I had to entirely change the way I was storing data.

You might claim that if I was a better programmer I'd have always produced a first half solution that was amenable to extension for the second half. First, I dispute that; I think there are always situations where the problem domain can change in enough directions that you can't handle all of them without a lot of effort. Secondly, I didn't find AoC an environment that encouraged me to optimise for generic solutions. Maybe some of the puzzles in isolation would allow for that, but a month of daily problems to solve while still engaging in regular life meant I hacked things up, took short cuts based on the knowledge I had of the input data, etc, etc.

Overall I can see the appeal, but the sheer quantity and the fact I write code as part of my day job just made it feel too much like a chore, rather than a fun mental exercise. I did wonder how they'd look as a set of interview puzzles (obviously a subset, rather than all of them), but I'm not sure how you'd actually use them for that - I wouldn't want anyone to have to solve them in a live interview.

So, in case it's not obvious, I'm not planning to engage in AoC again this year. But I'm continuing to persevere with Rust (though most of my work stuff is thankfully still Go).

18 August 2024

Debian Brasil: Debian Day 2024 in Pouso Alegre/MG - Brazil

by Thiago Pezzo and Giovani Ferreira

Local celebrations of Debian Day 2024 also took place in Pouso Alegre, MG, Brazil. This year we managed to organize two days of talks!

On the morning of Wednesday, 14 August 2024, we were at the Pouso Alegre campus of the Instituto Federal de Educação, Ciência e Tecnologia do Sul de Minas Gerais (IFSULDEMINAS). We gave an introductory presentation on the Debian Project, covering the operating system and the community, to all three years of the Technical High School Course in Informatics. The event was closed to IFSULDEMINAS, and around 60 students attended.

On the morning of Saturday, 17 August 2024, we held an event open to the community at the Universidade do Vale do Sapucaí (Univás), with institutional support from the Information Systems course. We talked about the Debian Project with Giovani Ferreira (Debian Developer); about the Debian pt_BR translation team with Thiago Pezzo; about day-to-day experiences using free software with Virgínia Cardoso; and about how to set up a production-ready development environment using Debian and Docker with Marcos Antônio dos Santos. After the talks, snacks, coffee and cake were served while participants chatted, asked questions and shared experiences.

We would like to thank everyone who helped us:

Some photos:
Presentation at the Pouso Alegre campus of IFSULDEMINAS (1)
Presentation at the Pouso Alegre campus of IFSULDEMINAS (2)
Presentation at the Fática campus of UNIVÁS (1)
Presentation at the Fática campus of UNIVÁS (2)
Presentation at the Fática campus of UNIVÁS (3)
Presentation at the Fática campus of UNIVÁS (4)

11 August 2024

Ravi Dwivedi: My Austrian Visa Refusal Story

Vienna - the capital of Austria - is one of the most visited cities in the world, popular for its rich history, gardens, and cafes, along with well-known artists like Beethoven, Mozart, Gödel, and Freud. It has also been consistently ranked as the most livable city in the world. For these reasons, I was elated when my friend Snehal invited me last year to visit Vienna for a few days. We included Christmas and New Year's Eve in my itinerary due to the city's popular Christmas markets and lively events. The festive season also ensured that Snehal had some days off for sightseeing.

Indians require a visa to visit Austria. Since the travel dates were near, I rushed to book an appointment online with VFS Global in Delhi, and quickly arranged the required documents. However, at VFS, I found out that I had applied in the wrong appointment category (tourist), which depends on the purpose of the visit, and that my travel dates did not allow enough time for visa authorities to make a decision. Apparently, even if you plan to stay only for a part of the trip with the host, you need to apply under the category "Visiting Friends and Family".

Thus, I had to book another appointment under this category, and took the opportunity to shift my travel dates to allow at least 15 business days for the visa application to be processed, removing Christmas and New Year's Eve from my itinerary. The process went smoothly, and my visa application was submitted by VFS. For reference, here's a list of documents I submitted -

The following charges were collected from me.
Service Description                               Amount (Indian Rupees)
Cash Handling Charge - SAC Code: (SAC:998599)     0
VFS Fee - India - SAC Code: (SAC:998599)          1,820
VISA Fee - India - SAC Code:                      7,280
Convenience Fee - SAC Code: (SAC:998599)          182
Courier Service - SAC Code: (SAC:998599)          728
Courier Assurance - SAC Code: (SAC:998599)        182
Total                                             10,192
I later learned that the courier charges (728 INR) and the courier assurance charges (182 INR) mentioned above were optional. However, VFS didn't ask whether I wanted to include them. When the embassy is done processing your application, it will send your passport back to VFS, from where you can either collect it yourself or get it couriered back home, which requires you to pay courier charges. However, courier assurance charges do not add any value, as VFS cannot assure anything about the courier, and I suggest you get them removed. My visa application was submitted on the 21st of December 2023. A few days later, on the 29th of December 2023, I received an email from the Austrian embassy asking me to submit an additional document -
Subject: AUSTRIAN VISA APPLICATION - AMENDMENT REQUEST: Ravi Dwivedi VIS 4331 Dear Applicant, On 22.12.2023 your application for Visa C was registered at the Embassy. You are requested to kindly send the scanned copies of the following documents via email to the Embassy or submit the documents at the nearest VFS centre, for further processing of your application:
  • Kindly submit Electronic letter of guarantee "EVE- Elektronische Verpflichtungserklärung" obtained from the "Fremdenpolizeibehörde" of the sponsor's district in Austria. Once your host company/inviting company has obtained the EVE, please share the reference number (starting from DEL_____) received from the authorities, with the Embassy.
Kindly Note: It is in your own interest to fulfil the requirements as indicated above and submit the missing documents within 14 days of the receipt of this email. Otherwise a decision will be taken based on the documentation available.
Sie werden in Ihrem Interesse ersucht, die gekennzeichneten Mängel so schnell wie möglich zu beheben bzw. fehlende Unterlagen umgehend nachzureichen, um die weitere Bearbeitung des Antrages zu ermöglichen. Sollten Sie innerhalb 14 Tagen die gekennzeichneten Mängel nicht beheben bzw. die fehlenden Unterlagen nicht nachreichen, wird über den vorliegenden Antrag ohne diese Unterlagen bzw. Mängelbehebung entschieden.
Austrian Embassy New Delhi
R.J/ Consular Section
+91 11 2419 2700
EP-13, Chandragupta Marg, Chanakyapuri, New Delhi 110 021, India
bmeia.gv.at/botschaft/new-delhi
facebook.at/AustrianEmbassyNewDelhi
twitter.com/MFA_Austria
I misunderstood the required document (the EVE) to be a scanned copy of the letter of guarantee form signed by Snehal, and responded by attaching it. Upon researching, Snehal determined that the document is an electronic letter of guarantee, and is supposed to be obtained at a local police station in Vienna. He visited a police station the next day and had a hard time conversing due to the language barrier (German is the common language in Austria, whereas Snehal speaks English). That day was a weekend, so he booked an appointment for Monday, but in the meantime the embassy had finished processing my visa. My visa was denied, and the refusal letter stated:
The Austrian embassy in Delhi examined your application; the visa has been refused. The decision is based on the following reason(s):
  • The information submitted regarding the justification for the purpose and conditions of the intended stay was not reliable.
  • There are reasonable doubts as to your intention to leave the territory of the Member States before the expiry of the visa.
Other remarks: You have been given an amendment request, which you have failed to fulfil, or have only fulfilled inadequately, within the deadline set. You are a first-time traveller. The social and economic roots with the home country are not evident. The return from Schengen territory does therefore not seem to be certain.
I could have reapplied after obtaining the EVE, but I didn't because I found the following line
The social and economic roots with the home country are not evident.
offensive for someone who was born and raised in India, got the impression that the absence of the electronic guarantee letter was not the only reason behind the refusal, had already wasted 12,000 INR on this application, and my friend's stay in Austria was uncertain after January. In fact, my friend soon returned to India. To summarize -
  1. If you are visiting a host, then the category of appointment at VFS must be "Visiting Friends and Family" rather than "Tourist".
  2. VFS charged me for courier assurance, which is an optional service. Make sure to get this charge removed from your bill.
  3. Neither my travel agent nor the VFS application center mentioned the EVE.
  4. While the required documents list from the VFS website does mention it in point 6, it leads to a dead link.
  5. Snehal informed me that a mere two months ago, his wife's visa was approved without an EVE. This hints at inconsistency in processing of applications, even those under identical categories.
Such incidents are a waste of time and money for applicants, and an embarrassment to VFS and the Austrian visa authorities. I suggest that the Austrian visa authorities fix that URL, and provide instructions for hosts to obtain the EVE. Credits to Snehal and Contrapunctus for editing, Badri for proofreading.

8 August 2024

Reproducible Builds: Reproducible Builds in July 2024

Welcome to the July 2024 report from the Reproducible Builds project! In our reports, we outline what we've been up to over the past month and highlight news items in software supply-chain security more broadly. As always, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Reproducible Builds Summit 2024
  2. Pulling Linux up by its bootstraps
  3. Towards Idempotent Rebuilds?
  4. AROMA: Automatic Reproduction of Maven Artifacts
  5. Community updates
  6. Android Reproducible Builds at IzzyOnDroid with rbtlog
  7. Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems
  8. Development news
  9. Website updates
  10. Upstream patches
  11. Reproducibility testing framework


Reproducible Builds Summit 2024

Last month, we were very pleased to announce the upcoming Reproducible Builds Summit, set to take place from September 17th-19th 2024 in Hamburg, Germany. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. If you're interested in joining us this year, please make sure to read the event page, which has more details about the event and location. We are very much looking forward to seeing many readers of these reports there.

Pulling Linux up by its bootstraps (LWN)

In a recent edition of Linux Weekly News, Daroc Alden has written an article on bootstrappable builds. It starts with a brief introduction explaining that
a bootstrappable build is one that builds existing software from scratch, for example, building GCC without relying on an existing copy of GCC. In 2023, the Guix project announced that the project had reduced the size of the binary bootstrap seed needed to build its operating system to just 357-bytes, not counting the Linux kernel required to run the build process.
The article goes on to describe that now, "the live-bootstrap project has gone a step further and removed the need for an existing kernel at all", and concludes:
The real benefit of bootstrappable builds comes from a few things. Like reproducible builds, they can make users more confident that the binary packages downloaded from a package mirror really do correspond to the open-source project whose source code they can inspect. Bootstrappable builds have also had positive effects on the complexity of building a Linux distribution from scratch […]. But most of all, bootstrappable builds are a boon to the longevity of our software ecosystem. It's easy for old software to become unbuildable. By having a well-known, self-contained chain of software that can build itself from a small seed, in a variety of environments, bootstrappable builds can help ensure that today's software is not lost, no matter where the open-source community goes from here.

Towards Idempotent Rebuilds?

Trisquel developer Simon Josefsson wrote an interesting blog post comparing the output of the .deb files from our tests.reproducible-builds.org testing framework and the ones in the official Debian archive. Following up from a previous post on the reproducibility of Trisquel, Simon notes that "typically [the] rebuilds do not match the official packages, even when they say the package is reproducible". Simon correctly identifies that "the purpose of [these] rebuilds are not to say anything about the official binary build, instead the purpose is to offer a QA service to maintainers by performing two builds of a package and declaring success if both builds match." However, Simon's post swiftly moves on to announce a new tool called debdistrebuild that performs rebuilds of the difference between two distributions in a GitLab pipeline and displays diffoscope output for further analysis.

AROMA: Automatic Reproduction of Maven Artifacts

Mehdi Keshani, Tudor-Gabriel Velican, Gideon Bot and Sebastian Proksch of the Delft University of Technology, Netherlands, have published a new paper in the ACM Software Engineering on a new tool to automatically reproduce Apache Maven artifacts:
Reproducible Central is an initiative that curates a list of reproducible Maven libraries, but the list is limited and challenging to maintain due to manual efforts. [We] investigate the feasibility of automatically finding the source code of a library from its Maven release and recovering information about the original release environment. Our tool, AROMA, can obtain this critical information from the artifact and the source repository through several heuristics and we use the results for reproduction attempts of Maven packages. Overall, our approach achieves an accuracy of up to 99.5% when compared field-by-field to the existing manual approach [and] we reveal that automatic reproducibility is feasible for 23.4% of the Maven packages using AROMA, and 8% of these packages are fully reproducible.

Community updates

On our mailing list this month:
  • Nichita Morcotilo reached out to the community, first to share their efforts to build reproducible packages cross-platform with a new build tool called rattler-build, noting that "as you can imagine, building packages reproducibly on Windows is the hardest challenge (so far!)". Nichita goes on to mention that the Apple ecosystem appears to be using ZERO_AR_DATE over SOURCE_DATE_EPOCH.
  • Roland Clobus announced that the Debian bookworm 12.6 live images are "nearly reproducible", with more detail in the post itself and input in the thread from other contributors.
  • As reported in last month's report, Pol Dellaiera completed his master thesis on Reproducibility in Software Engineering at the University of Mons, Belgium. This month, Pol announced this on the list with more background info. Since the master thesis sources have been available, it has received some feedback and contributions. As a result, an updated version of the thesis has been published containing those community fixes.
  • Daniel Gröber asked for help in getting the Yosys documentation to build reproducibly, citing issues in inter alia the PDF generation causing differing CreationDate metadata values.
  • James Addison continued his long journey towards getting the Sphinx documentation generator to build reproducible documentation. In this thread, James concerns himself with the problem that even when SOURCE_DATE_EPOCH is configured, Sphinx projects that have configured their copyright notices using dynamic elements can produce nonsensical output under some circumstances. James' query ended up generating a number of replies.
  • Allen "gunner" Gunner posted a brief update on the progress the core team is making towards introducing a Code of Conduct (CoC) "such that it is in place in time for the RB Summit in Hamburg in September". In particular, gunner asks "if you are interested in helping with CoC design and development in the weeks ahead, simply email rb-core@lists.reproducible-builds.org and let us know".
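For readers unfamiliar with SOURCE_DATE_EPOCH, which appears in several of the items above: it is an environment variable holding a Unix timestamp that build tools are expected to use in place of the current time. The following is a minimal Rust sketch of how a tool might honour it. The helper names parse_source_date_epoch and build_timestamp are hypothetical, and the code is not taken from Sphinx, rattler-build or any other project mentioned here.

```rust
use std::env;
use std::time::{SystemTime, UNIX_EPOCH};

/// Parses a SOURCE_DATE_EPOCH value (seconds since the Unix epoch).
/// Returns None when the variable is unset or not a valid unsigned integer.
/// Trimming surrounding whitespace is a leniency chosen for this sketch.
fn parse_source_date_epoch(value: Option<&str>) -> Option<u64> {
    value.and_then(|v| v.trim().parse::<u64>().ok())
}

/// Picks the timestamp to embed in build output: the pinned value from
/// the environment when present, otherwise the current wall-clock time.
fn build_timestamp() -> u64 {
    let from_env = env::var("SOURCE_DATE_EPOCH").ok();
    parse_source_date_epoch(from_env.as_deref()).unwrap_or_else(|| {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0) // clock set before 1970: fall back to the epoch
    })
}
```

Anything derived from the clock, such as a copyright year in generated documentation, would then be computed from build_timestamp() rather than from the current time, so two builds of the same source yield the same output.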

Android Reproducible Builds at IzzyOnDroid with rbtlog

On our mailing list, Fay Stegerman announced a new Reproducible Builds collaboration in the Android ecosystem:
We are pleased to announce "Reproducible Builds, special client support and more in our repo": a collaboration between various independent interoperable projects: the IzzyOnDroid team, 3rd-party clients Droid-ify & Neo Store, and rbtlog (part of my collection of tools for Android Reproducible Builds) to bring Reproducible Builds to IzzyOnDroid and the wider Android ecosystem.

Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems

Congratulations to Marina Moore of the New York Tandon School of Engineering who has submitted her PhD thesis on Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems. The introduction outlines its contributions to the field:
[S]oftware repositories are a vital component of software development and release, with packages downloaded both for direct use and to use as dependencies for other software. Further, when software is updated due to patched vulnerabilities or new features, it is vital that users are able to see and install this patched version of the software. However, this process of updating software can also be the source of attack. To address these attacks, secure software update systems have been proposed. However, these secure software update systems have seen barriers to widespread adoption. The Update Framework (TUF) was introduced in 2010 to address several attacks on software update systems including repository compromise, rollback attacks, and arbitrary software installation. Despite this, compromises continue to occur, with millions of users impacted by such compromises. My work has addressed substantial challenges to adoption of secure software update systems grounded in an understanding of practical concerns. Work with industry and academic communities provided opportunities to discover challenges, expand adoption, and raise awareness about secure software updates. [ ]

Development news

In Debian this month, 12 reviews of Debian packages were added, 13 were updated and 6 were removed, adding to our knowledge about identified issues. A new toolchain issue type was identified as well, specifically ordering_differences_in_pkg_info.
Colin Percival filed a bug against the LLVM compiler noting that building i386 binaries on the i386 architecture produces different results from building i386 binaries under amd64. The issue was narrowed down to x87 excess precision, which can result in slightly different register choices depending on whether the compiler is hosted on x86_64 or i386, and a fix was committed.
Fay Stegerman performed some in-depth research surrounding her apksigcopier tool, after some Android .apk files signed with the latest apksigner could no longer be verified as reproducible. Fay identified the issue as follows:
Since build-tools >= 35.0.0-rc1, backwards-incompatible changes to apksigner break apksigcopier as it now by default forcibly replaces existing alignment padding and changed the default page alignment from 4k to 16k (same as Android Gradle Plugin >= 8.3, so the latter is only an issue when using older AGP).
She documented multiple available workarounds and filed a bug in Google's issue tracker.
Lastly, diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded version 272 and Mattia Rizzolo uploaded version 273 to Debian, and the following changes were made as well:
  • Chris Lamb:
    • Ensure that the convert utility is from ImageMagick version 6.x. The command-line interface has seemingly changed with the 7.x series of ImageMagick.
    • Factor out version detection in test_jpeg_image.
    • Correct the import of the identify_version method after a refactoring change in a previous commit.
    • Move away from using DSA OpenSSH keys in tests as support has been deprecated and removed in OpenSSH version 9.8p1.
    • Move to assert_diff in the test_openssh_pub_key package.
    • Update copyright years.
  • Mattia Rizzolo:
    • Add support for ffmpeg version 7.x which adds some extra context to the diff.
    • Rework the handling of OpenSSH testing of DSA keys if OpenSSH is strictly 9.7, and add an OpenSSH key test with an ed25519-format key.
    • Temporarily disable a few packages that are not available in Debian testing.
    • Stop ignoring the results of Debian testing in the continuous integration system.
    • Adjust options in debian/source to make sure not to pack the Python sdist directory into the binary Debian package.
    • Adjust Lintian overrides.

Website updates

There were a number of improvements made to our website this month, including:

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In July, a number of changes were made by Holger Levsen, including:
  • Grant bremner access to the ionos7 node.
  • Perform a dummy change to force update of all jobs.
In addition, Vagrant Cascadian performed some necessary node maintenance of the underlying build hosts.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

30 July 2024

Russell Coker: Links July 2024

Interesting Scientific American article about the way that language shapes thought processes and how it was demonstrated in eye tracking experiments with people who have Aboriginal languages as their first language [1].
David Brin wrote an interesting article Do We Really Want Immortality [2]. I disagree with his conclusions about the politics though. Better manufacturing technology should allow decreasing the retirement age while funding schools well.
Scientific American has a surprising article about the differences between Chimp and Bonobo parenting [3]. I'd never have expected Chimp moms to be protective.
Sam Varghese wrote an insightful and informative article about the corruption in Indian politics and the attempts to silence Australian journalist Avani Dias [4].
WorksInProgress has an insightful article about the world's first around the world solo yacht race [5]. It has some interesting ideas about engineering.
Htwo has an interesting video about adverts for fake games [6]. It's surprising how they apparently make money from advertising games that don't exist.
Elena Hashman wrote an insightful blog post about Chronic Fatigue Syndrome [7]. I hope they make some progress on curing it soon. The fact that it seems similar to long Covid which is quite common suggests that a lot of research will be applied to that sort of thing.
Bruce Schneier wrote an insightful blog post about the risks of MS Copilot [8].
Krebs has an interesting article about how Apple does Wifi AP based geo-location and how that can be abused for tracking APs in warzones etc. Bad Apple! [9].
Bruce Schneier wrote an insightful blog post on How AI Will Change Democracy [10].
Charles Stross wrote an amusing and insightful post about MS Recall titled Is Microsoft Trying to Commit Suicide [11].
Bruce Schneier wrote an insightful blog post about seeing the world as a data structure [12].
Luke Miani has an informative YouTube video about eBay scammers selling overpriced MacBooks [13].
The Yorkshire Ranter has an insightful article about Ronald Coase and the problems with outsourcing big development contracts as an array of contracts without any overall control [14].

21 July 2024

Mike Gabriel: Polis - a FLOSS Tool for Civic Participation -- Issues extending Polis and adjusting our Goals

Here comes the 3rd article of the 5-episode blog post series on Polis, written by Guido Berhörster, member of staff at my company Fre(i)e Software GmbH. Enjoy also this read on Guido's work on Polis,
Mike
Table of Contents of the Blog Post Series
  1. Introduction
  2. Initial evaluation and adaptation
  3. Issues extending Polis and adjusting our goals (this article)
  4. Creating (a) new frontend(s) for Polis
  5. Current status and roadmap
Polis - Issues extending Polis and adjusting our Goals
After the initial implementation of limited branding support, user feedback and the involvement of a UX designer led to the conclusion that we needed more far-reaching changes to the user interface in order to reduce visual clutter, rearrange and improve UI elements, and provide better integration with the websites in which conversations are embedded.
Challenges when visualizing Data in Polis
Polis visualizes groups using a spatial projection of users based on similarities in voting behavior and places them in two to five groups using a clustering algorithm. During our testing and evaluation, users were rarely able to interpret the visualization and often intuitively made incorrect assumptions, e.g. by associating the filled area of a group with its significance or size. After consultation with a member of the Multi-Agent Systems (MAS) Group at the University of Groningen we chose to temporarily replace the visualization offered by Polis with simple bar charts representing agreement or disagreement with statements of a group or the majority. We intend to revisit this and explore different forms of visualization at a later point in time.
The various factors that feed into the weight attached to each statement, and which determine the pseudo-random order in which statements are presented for voting ("comment routing"), proved difficult to explain to stakeholders and users. Together with the Polis authors' own admission of the ad-hoc and heuristic nature of the algorithm used [1], this led to the decision to temporarily remove this feature. Instead, statements should be placed into three groups, namely
  1. metadata questions,
  2. seed statements,
  3. and participant statements
Statements should then be sorted by group but in a fully randomized order within the group, so that metadata questions would be presented before seed statements, which would be presented before participants' statements. This simpler method was deemed sufficient for the scale of our pilot projects; however, we intend to revisit this decision and explore different methods of comment routing in cooperation with our scientific partners at a later point in time.
An evaluation of the requirements for implementing mandatory authentication and adding support for additional authentication methods to Polis showed that significant changes to both the administration and participation frontend were needed, due to the lack of an abstraction layer or extension mechanism and the current authentication providers being hardcoded in many parts of the code base.
A New Frontend is born: Particiapp
Based on the implementation details of the participation frontend, the invasive nature of the changes required, and the overhead of keeping up with active upstream development, it became clear that a different, more flexible approach to development was needed. This ultimately led to the creation of Particiapp, a new Open Source project providing the building blocks and necessary abstraction layers for rapid prototyping and experimentation with different frontends which are compatible with but independent from Polis.
  1. Small, Christopher T., Bjorkegren, Michael, Erkkilä, Timo, Shaw, Lynette and Megill, Colin (2021). Polis: Scaling deliberation by mapping high dimensional opinion spaces. Recerca. Revista de Pensament i Anàlisi, 26(2), pp. 1-26.
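The simplified statement ordering described above (metadata questions first, then seed statements, then participant statements, shuffled within each group) could look roughly like the following. This is only a minimal sketch: the `Statement` class, the group labels and the `order_statements` function are hypothetical names chosen for illustration and are not taken from Polis or Particiapp.

```python
import random
from dataclasses import dataclass

# Presentation order of the three groups named in the article.
# The string labels themselves are an assumption made for this sketch.
GROUP_ORDER = ("metadata", "seed", "participant")


@dataclass(frozen=True)
class Statement:
    """Hypothetical stand-in for a statement shown to a participant."""
    text: str
    group: str  # one of GROUP_ORDER


def order_statements(statements, rng=None):
    """Return statements sorted by group, fully shuffled within each group."""
    rng = rng if rng is not None else random.Random()
    ordered = []
    for group in GROUP_ORDER:
        members = [s for s in statements if s.group == group]
        rng.shuffle(members)  # uniform random order inside the group
        ordered.extend(members)
    return ordered
```

Passing a seeded `random.Random` instance makes the order repeatable, which may be handy in tests; in production one would presumably leave `rng` unset.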

18 July 2024

Enrico Zini: meson, includedir, and current directory

Suppose you have a meson project like this: meson.build:
project('example', 'cpp', version: '1.0', license : ' ', default_options: ['warning_level=everything', 'cpp_std=c++17'])
subdir('example')
example/meson.build:
test_example = executable('example-test', ['main.cc'])
example/string.h:
/* This file intentionally left empty */
example/main.cc:
#include <cstring>
int main(int argc,const char* argv[])
{
    std::string foo("foo");
    return 0;
}
This builds fine with autotools and cmake, but not meson:
$ meson setup builddir
The Meson build system
Version: 1.0.1
Source dir: /home/enrico/dev/deb/wobble-repr
Build dir: /home/enrico/dev/deb/wobble-repr/builddir
Build type: native build
Project name: example
Project version: 1.0
C++ compiler for the host machine: ccache c++ (gcc 12.2.0 "c++ (Debian 12.2.0-14) 12.2.0")
C++ linker for the host machine: c++ ld.bfd 2.40
Host machine cpu family: x86_64
Host machine cpu: x86_64
Build targets in project: 1
Found ninja-1.11.1 at /usr/bin/ninja
$ ninja -C builddir
ninja: Entering directory `builddir'
[1/2] Compiling C++ object example/example-test.p/main.cc.o
FAILED: example/example-test.p/main.cc.o
ccache c++ -Iexample/example-test.p -Iexample -I../example -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Wpedantic -Wcast-qual -Wconversion -Wfloat-equal -Wformat=2 -Winline -Wmissing-declarations -Wredundant-decls -Wshadow -Wundef -Wuninitialized -Wwrite-strings -Wdisabled-optimization -Wpacked -Wpadded -Wmultichar -Wswitch-default -Wswitch-enum -Wunused-macros -Wmissing-include-dirs -Wunsafe-loop-optimizations -Wstack-protector -Wstrict-overflow=5 -Warray-bounds=2 -Wlogical-op -Wstrict-aliasing=3 -Wvla -Wdouble-promotion -Wsuggest-attribute=const -Wsuggest-attribute=noreturn -Wsuggest-attribute=pure -Wtrampolines -Wvector-operation-performance -Wsuggest-attribute=format -Wdate-time -Wformat-signedness -Wnormalized=nfc -Wduplicated-cond -Wnull-dereference -Wshift-negative-value -Wshift-overflow=2 -Wunused-const-variable=2 -Walloca -Walloc-zero -Wformat-overflow=2 -Wformat-truncation=2 -Wstringop-overflow=3 -Wduplicated-branches -Wattribute-alias=2 -Wcast-align=strict -Wsuggest-attribute=cold -Wsuggest-attribute=malloc -Wanalyzer-too-complex -Warith-conversion -Wbidi-chars=ucn -Wopenacc-parallelism -Wtrivial-auto-var-init -Wctor-dtor-privacy -Weffc++ -Wnon-virtual-dtor -Wold-style-cast -Woverloaded-virtual -Wsign-promo -Wstrict-null-sentinel -Wnoexcept -Wzero-as-null-pointer-constant -Wabi-tag -Wuseless-cast -Wconditionally-supported -Wsuggest-final-methods -Wsuggest-final-types -Wsuggest-override -Wmultiple-inheritance -Wplacement-new=2 -Wvirtual-inheritance -Waligned-new=all -Wnoexcept-type -Wregister -Wcatch-value=3 -Wextra-semi -Wdeprecated-copy-dtor -Wredundant-move -Wcomma-subscript -Wmismatched-tags -Wredundant-tags -Wvolatile -Wdeprecated-enum-enum-conversion -Wdeprecated-enum-float-conversion -Winvalid-imported-macros -std=c++17 -O0 -g -MD -MQ example/example-test.p/main.cc.o -MF example/example-test.p/main.cc.o.d -o example/example-test.p/main.cc.o -c ../example/main.cc
In file included from ../example/main.cc:1:
/usr/include/c++/12/cstring:77:11: error: 'memchr' has not been declared in '::'
   77 |   using ::memchr;
      |           ^~~~~~
/usr/include/c++/12/cstring:78:11: error: 'memcmp' has not been declared in '::'
   78 |   using ::memcmp;
      |           ^~~~~~
/usr/include/c++/12/cstring:79:11: error: 'memcpy' has not been declared in '::'
   79 |   using ::memcpy;
      |           ^~~~~~
/usr/include/c++/12/cstring:80:11: error: 'memmove' has not been declared in '::'
   80 |   using ::memmove;
      |           ^~~~~~~
 
It turns out that meson adds the current directory to the include path by default:
Another thing to note is that include_directories adds both the source directory and corresponding build directory to include path, so you don't have to care.
It seems that I have to care after all. Thankfully there is an implicit_include_directories setting that can turn this off if needed. Its documentation is not as easy to find as I'd like (kudos to Kangie on IRC), and hopefully this blog post will make it easier for me to find it in the future.

13 July 2024

Anuradha Weeraman: Windows of Opportunity: Microsoft's Open Source Renaissance

Twenty years ago, it was easy to dislike Microsoft. It was the quintessential evil MegaCorp that was quick to squash competition, often ruthlessly, but in some cases slowly through a more insidious process of embracing, extending, and exterminating anything that got in the way. This was the signature personality of Ballmer-era Microsoft that also inspired and united the software freedom fighting forces that came together to safeguard things that mattered to them and were at risk.
I remember the era when the Novell, SCO, and Microsoft saga cast fear, uncertainty, and doubt on the future of open Unix and Linux and on what would happen to the operating systems that we loved if the suits of Redmond prevailed. Looking back, I'm glad that the arc of this story has bent towards justice, and I shudder at the possibilities had it worked out differently.
Looking at today's Microsoft, I'm amazed at how much change a leader with the right vision can make to the trajectory of a company. Even an old-school software freedom advocate such as me can admire and applaud the strides it has taken in the last 10 or so years, which have dramatically shifted the perception of Microsoft. The personality of the Satya-era Microsoft is one to behold. While it will take more time to win back the trust, we see the tides changing and the positivity is important for the entire industry.
For Microsoft, it was TypeScript and VS Code that helped change the narrative internally, which led to its internal resurgence and acceptance of open source. Its acquisition of GitHub propelled it forward within the community overnight. Its contributions to the Linux kernel and other major software projects have also been consequential in changing its public perception.
Trust takes a while to claw back and is very easy to breach. This time, however, Microsoft seems to understand this dynamic more than it did 20 years ago.
All it took was the right leadership.

12 July 2024

Reproducible Builds: Reproducible Builds in June 2024

Welcome to the June 2024 report from the Reproducible Builds project! In our reports, we outline what we've been up to over the past month and highlight news items in software supply-chain security more broadly. As always, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Next Reproducible Builds Summit dates announced
  2. GNU Guix patch review session for reproducibility
  3. New reproducibility-related academic papers
  4. Misc development news
  5. Website updates
  6. Reproducibility testing framework


Next Reproducible Builds Summit dates announced
We are very pleased to announce the upcoming Reproducible Builds Summit, set to take place from September 17th to 19th 2024 in Hamburg, Germany. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. If you're interested in joining us this year, please make sure to read the event page which has more details about the event and location. We are very much looking forward to seeing many readers of these reports there.

GNU Guix patch review session for reproducibility
Vagrant Cascadian will be holding a Reproducible Builds session as part of the monthly Guix patch review series on July 11th at 17:00 UTC. These online events are intended to encourage everyone to become a patch reviewer. The goal of reviewing patches is to help the Guix project accept contributions while maintaining our quality standards, and to learn how to do patch reviews together in a friendly hacking session.

Development news
In Debian this month, 4 reviews of Debian packages were added, 11 were updated and 14 were removed, adding to our knowledge about identified issues. Only one issue type was updated, though, explaining that we don't vary the build path anymore.
On our mailing list this month, Bernhard M. Wiedemann wrote that whilst he had previously collected issues that introduce non-determinism, he has now moved on to discuss "mitigations", in the sense of how we can "avoid whole categories of problem without patching an infinite number of individual packages". In addition, Janneke Nieuwenhuizen announced the release of two versions of GNU Mes.
In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.
In NixOS, with the 24.05 release out, it was again validated that our minimal ISO is reproducible by building it on a virtual machine with no access to the binary cache.
What's more, we continued to write patches in order to fix specific reproducibility issues, including Bernhard M. Wiedemann writing three patches (for qutebrowser, samba and systemd), Chris Lamb filing Debian bug #1074214 against the fastfetch package and Arnout Engelen proposing fixes to refind and for the Scala compiler.
Lastly, diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded two versions (270 and 271) to Debian, and made the following changes as well:
  • Drop Build-Depends on liblz4-tool in order to fix Debian bug #1072575.
  • Update tests to support zipdetails version 4.004 that is shipped with Perl 5.40.

Website updates
There were a number of improvements made to our website this month, including Akihiro Suda very helpfully making the <h4> elements more distinguishable from the <h3> level as well as adding a guide for Dockerfile reproducibility. In addition, Fay Stegerman added two tools, apksigcopier and reproducible-apk-tools, to our Tools page.

Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:
  • Marking the virt(32|64)c-armhf nodes as down.
  • Granting a developer access to the osuosl4 node in order to debug a regression on the ppc64el architecture.
  • Granting a developer access to the osuosl4 node.
In addition, Mattia Rizzolo re-aligned the /etc/default/jenkins file with changes performed upstream and changed how configuration files are handled on the rb-mail1 host, whilst Vagrant Cascadian documented the failure of the virt32c and virt64c nodes after initial investigation.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 July 2024

Russ Allbery: rra-c-util 11.0.0

rra-c-util is my collection of utility and test functions that I keep synchronized between my packages. The big change in this release is that I've switched to semantic versions, which I plan to do for all of my packages, and I've started using scriv to manage a Markdown change log. We've been using scriv for a while at work, so I have finally gotten on board the change log fragment train rather than assuming linear commits. This release also raises the minimum Perl version for all of the Perl support code to 5.12 so that I can use semantic versions for all modules, and updates the perltidy configuration with lots of improvements from Julien ÉLIE. There are also the normal variety of bug fixes and more minor improvements. You can get the latest version from the rra-c-util distribution page.

5 July 2024

Sahil Dhiman: Atleast Not Written by an AI

I keep on going back and correcting a boatload of grammatical and other errors in my posts here. I feel somewhat embarrassed at how such mistakes slip through when I was proofreading. Back then it was all good, and suddenly this mistake cropped up in my text, which everyone might have already noticed by now. A thought just stuck around that. Those mistakes signify that the text is written by a real human, and humans make mistakes. :) PS - Even LanguageTool (non-premium) couldn't identify those errors.

2 July 2024

Dima Kogan: vnlog.slurp() with non-numerical data

For a while now I'd see an annoying problem when trying to analyze data. I would be trying to import into numpy an innocuous-looking data file like this:
#  image   x y z temperature
image1.png 1 2 5 34
image2.png 3 4 1 35
As usual, I would be using vnlog.slurp() (a thin wrapper around numpy.loadtxt()) to read this in, but that doesn't work: the image filenames aren't parseable as numerical values. Up until now I would work around this by using the subprocess module to fork off a vnl-filter -p !image and then slurp that, but it's a pain and slow and has other issues. I just solved this conclusively using the numpy structured dtypes. I can now do this:
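import numpy as np
import vnlog

# One field per logical column group: a string, a 3-vector and a scalar
dtype = np.dtype([ ('image',       'U16'),
                   ('x y z',       int, (3,)),
                   ('temperature', float), ])
arr = vnlog.slurp("data.vnl", dtype=dtype)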
This will read the image filename, the xyz points and the temperature into different sub-arrays, with different types each. Accessing the result looks like this:
print(arr['image'])
---> array(['image1.png', 'image2.png'], dtype='<U16')
print(arr['x y z'])
---> array([[1, 2, 5],
            [3, 4, 1]])
print(arr['temperature'])
---> array([34., 35.])
Notes: We still do not support records with any null values (-). This could probably be handled with the converters kwarg of numpy.loadtxt(), but that sounds slow. I'll look at that later. This is available today in vnlog 1.38.
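For readers without vnlog installed, the same structured-dtype idea can be tried with plain numpy.loadtxt(). This is only a sketch of the underlying technique, not of how vnlog.slurp() is implemented: plain loadtxt() maps columns to fields purely by position (it treats the header as a comment), so the field name 'xyz' below is an arbitrary choice, and the data is inlined through io.StringIO for illustration.

```python
import io

import numpy as np

# Same data as in the post, inlined so that the example is self-contained.
DATA = """\
#  image   x y z temperature
image1.png 1 2 5 34
image2.png 3 4 1 35
"""

# A string field, a 3-element integer sub-array and a float scalar.
# Fields are filled from the columns in order: 1 + 3 + 1 = 5 columns.
dtype = np.dtype([
    ("image", "U16"),
    ("xyz", int, (3,)),
    ("temperature", float),
])

# The '#' header line is skipped as a comment by default.
arr = np.loadtxt(io.StringIO(DATA), dtype=dtype)

print(arr["image"])        # filenames, one per record
print(arr["xyz"])          # (N, 3) integer array
print(arr["temperature"])  # (N,) float array
```

Each record becomes one element of a 1-dimensional structured array, and indexing by field name gives back an ordinary array of the corresponding type.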

1 June 2024

Russ Allbery: Review: I Shall Wear Midnight

Review: I Shall Wear Midnight, by Terry Pratchett
Series: Discworld #38
Publisher: Harper
Copyright: 2010
Printing: 2011
ISBN: 0-06-143306-3
Format: Trade paperback
Pages: 355
I Shall Wear Midnight is the 38th Discworld novel and the 4th Tiffany Aching novel. This is not a good place to start reading. Tiffany has finished her training and has returned to her home on the Chalk, taking up her duties as the local witch. There are a lot of those, because there's a lot that needs doing. In some cases, such as taking away the pain of the old Duke, they involve things that require magic and that only Tiffany can do. In many other cases, other people could pick up some of the work, but they lack Tiffany's sense of duty and willingness to pay attention. The people of the Chalk have always been a bit suspicious of witches, in part because the job was done for so long by Tiffany's grandmother and no one thought she was a witch. (She was a witch.) Of late, however, that suspicion seems to be getting worse. It comes to a head when Tiffany is accused of theft and worse by the old Duke's maid, a woman with very fixed ideas about the evils of witches. Tiffany has to sort out what's going on and clear herself, all while navigating her now-awkward relationship with the Duke's son Roland, his unimpressive fiancee, and his spectacularly annoying aunt. Ah, this is the stuff. This is exactly the Tiffany Aching novel that I have been hoping Pratchett would write. It's pure, snarky competence porn from start to finish.
"I'm a witch. It's what we do. When it's nobody else's business, it's my business."
One of the things that I adore about this series is how well Pratchett shows the different ways in which one can be a witch. Granny Weatherwax out-thinks everyone and nudges (or shoves) people in the right direction, but her natural tendency is to be icy and a bit frightening. Nanny Ogg is that person you can't help but talk to, who may seem happy-go-lucky and hedonistic but who can effortlessly change the mood of a room. And Tiffany is stubborn duty and blunt practicality, which fits the daughter of shepherds. In previous books, we've watched Tiffany as a student, learning the practicalities of being a witch. This is the book where she realizes how much she knows and how much easier the world is to navigate when she's in her own territory. There is a wonderful scene, late in this book, where Pratchett shows Nanny Ogg at her best, doing the kinds of things that only Nanny Ogg can do. Both Tiffany and the reader are in awe.
I should have learned this, she thought. I wanted to learn fire, and pain, but I should have learned people.
And it's true that Nanny Ogg can do things that Tiffany can't. But what makes this book so great is that it shows how Tiffany's personality and her training come together with her knowledge of the Chalk. She may not know people, in general, but she knows her neighbors and how they think. She doesn't manage them the way that Nanny Ogg would; she's better at solving different kinds of problems, in different ways. But they're the right ways, and the right problems, for her home. This is another Discworld novel with a forgettable villain that's more of a malevolent force of nature than a character in its own right. It's also another Discworld novel where Pratchett externalizes a human tendency into a malevolent force that can possess people. I have mixed feelings about this narrative approach. That externalization of evil into (in essence) demons has been repeatedly used to squirm out of responsibility and excuse atrocities, and it neatly avoids having to wrestle with the hard questions of prejudice and injustice and why apparently good people do awful things. I think some of those weaknesses persist even in Pratchett's hands, but I think what he was attempting with that approach in this book is to show how almost no one is immune to nastier ideas that spread through society. Rather than using the externalization of evil as an excuse, he's using it as a warning. With enough exposure to those ideas, they start sounding tempting and partly credible even to people who would never have embraced them earlier. Pratchett also does a good job capturing the way prejudice can start from thoughtless actions that have more to do with the specific circumstances of someone's life than any coherent strategy. Still, the one major complaint I have about this book is that the externalization of evil is an inaccurate portrayal of the world, and this catches up with Pratchett at the ending. 
Postulating an external malevolent force reduces evil to something that can be puzzled out and decisively defeated, thus resolving the problem. Sadly, this is not how humans actually work. I'll forgive that structural flaw, though, because the rest of this book is so good. It's rare that a plot twist in a Discworld novel surprises me (twisty plots are not Pratchett's strength), but this one did. I will not spoil the surprise, but one of the characters is not quite who they seem to be, and Tiffany's reactions once she figures that out are one of my favorite parts of this book. Pratchett is making a point about assumptions, observation, and the importance of being willing to change one's mind about someone when you know more, and I thought it was very well done. But, most of all, I enjoyed reading about Tiffany being calm, competent, determined, and capable. There's also a bit of an unexpected romance plot that's one of my favorite types: the person who notices that you're doing a lot of work and quietly steps in and starts helping while paying attention to what's needed and not taking over. And it's full of the sort of pithy moral wisdom that makes Discworld such a delight to read.
"There have been times, lately, when I dearly wished that I could change the past. Well, I can't, but I can change the present, so that when it becomes the past it will turn out to be a past worth having."
This was just what I wanted. Highly recommended. Followed by Snuff in publication order. The next (and last, sadly) Tiffany Aching book is The Shepherd's Crown. Rating: 9 out of 10

28 May 2024

Russell Coker: Creating a Micro Users Group

FOSDEM had a great lecture, "Building an Open Source Community One Friend at a Time" [1]. I recommend that everyone who is involved in the FOSS community watches this lecture to get some ideas. For some time I've been periodically inviting a few friends to visit for lunch, chat about Linux, maybe do some coding, and watch some anime between coding. It seems that I have accidentally created a micro users group. LUGs were really big in the mid to late 90s and still quite vibrant in the early 2000s. But they seem to have decreased in popularity even before Covid19, and since Covid19 a lot of people have stopped attending large meetings to avoid health risks. I think that a large part of the decline of users groups has been due to the success of YouTube. Being able to choose from thousands of hours of lectures about computers on YouTube is a disincentive to spending the time and effort needed to attend a meeting with content that's probably not your first choice of topic. A formal meeting where someone you don't know has arranged a lecture might not have a topic that's really interesting to you. Having lunch with a couple of friends and watching a YouTube video that one of your friends assures you is really good is something more people will find interesting. In recent times homeschooling [2] has become more widely known. The same factors that allow learning about computers at home also make homeschooling easier. The difference between the traditional LUG model of having everyone meet at a fixed time for a lecture and a micro LUG of a small group of people having an informal meeting is similar to the difference between traditional schools and homeschooling. I encourage everyone to create their own micro LUG. All you have to do is choose a suitable time and place and invite some people who are interested. Have a BBQ in a park if the weather is good, meet at a cafe or restaurant, or invite people to visit you for lunch on a weekend.

27 May 2024

Sahil Dhiman: A Late, Late Debconf23 Post

After much procrastination, I have gotten around to completing my DebConf23 (DC23), Kochi blog post. I lost the original etherpad, which was started before DebConf23 for jotting down things. Now, I have started afresh with whatever I can remember, months after the actual conference ended. So things might only be as accurate as my memory. DebConf23, the 24th annual Debian Conference, happened in Infopark, Kochi, India from 10th September to 17th September 2023. It was preceded by DebCamp from 3rd September to 9th September 2023. The first formal bid to host DebConf in India was made during DebConf18 in Hsinchu, Taiwan by Raju Dev, which didn't come our way. At the next DebConf, DebConf19 in Curitiba, Brazil, another bid was made by him with help and support from Sruthi, Utkarsh and the whole team. This time, India got the opportunity to host DebConf22, which eventually became DebConf23 for the reasons you all know. I initially met the local team on the sidelines of DebConf20, which was also my first DebConf. As I had recently switched to Debian, DC20 introduced me to how things work in Debian. The Video team's call for volunteers email pulled me in. Things stuck, and I kept hanging out and helping the local Indian DC team with various stuff. We did manage to organize multiple events leading up to DebConf23, including MiniDebConf India 2021 Online, MiniDebConf Palakkad 2022, MiniDebConf Tamil Nadu 2023 and DebUtsav Kochi 2023, which gave us quite a bit of experience and workout. Many local organizers from these conferences later joined various DebConf teams during the conference to help out. For DebConf23, I was originally part of the publicity team because that was my usual thing. After a team redistribution exercise, Sruthi and Praveen moved me to the sponsorship team, as we didn't have to do much publicity anyhow and sponsorship was one of those things I could get involved in remotely. The sponsorship team had to take care of raising funds by reaching out to sponsors, managing invoices and fulfillment.
Praveen joined the sponsorship team as well. We also had an international sponsorship team, Anisa, Daniel and various Debian Trusted Organizations (TOs), which took care of reaching out to international organizations, while we took care of reaching out to Indian organizations for sponsorship. It was a really proud moment when my present employer, Unmukti (makers of hopbox), came aboard as a Bronze sponsor. Fundraising, though, seemed to be hit hard by the tech industry slowdown and layoffs. Many of our yesteryear sponsors couldn't sponsor. We had biweekly local team meetings, which were turned into weekly ones as we neared the event. This was done in addition to the biweekly global team meeting.
Pathu, DebConf23 mascot
To describe the conference venue, it happened in InfoPark, Kochi with the main conference hall being Athulya Hall, and food, accommodation and two smaller halls in the Four Points Hotel, right outside Infopark. We got Athulya Hall as part of venue sponsorship from Infopark. The distance between the two was around 300 meters. Halls were named Anamudi, Kuthiran and Ponmudi after hills and mountain areas in the host state of Kerala. Other than Anamudi hall, which was the main hall, I couldn't remember the names of the halls, and I still can't. Four Points was big and expensive, and we had, as expected, cost overruns. Due to how DebConf functions, an Indian university wasn't suitable to host a conference of this scale.
Four Point's Infinity Pool at Night
I landed in Kochi on the first day of DebCamp on 3rd September. As usual, I met Abraham first, and the better part of the next hour was spent on meet and greet. It was my first IRL DebConf, so I met many old friends and new folks. I got a room to myself. Abraham lived nearby and hadn't taken the accommodation, so I asked him to join. He finally joined from the second day onwards. All through the conference, room 928 became infamous for various reasons, and I had various roommates for company. In DebCamp days, we would get up to have breakfast and go back to sleep and get active only past lunch for hacking and helping in the hack lab for the day, followed by fun late night discussions and parties.
Nilesh, Chirag and Apple at DC23
The team even managed to get a press conference arranged, and we got an opportunity to go to Press Club, Ernakulam. Sruthi and Jonathan gave the speech and answered questions from journalists. Thanks to this, the event was covered by the media as well.
Ernakulam Press Club
During the conference, every night the team used to have 9 PM meetings for retrospection and planning for the next day, which was always dotted with new problems. Every day, we used to hijack the Silent Hacklab for the meeting and gently ask the only people there at the time to give us space. DebConf in itself is a well oiled machine. The network was brought up from scratch. The Video team built the recording, audio mixing, live-streaming, editing and transcoding infrastructure on site. A gaming rig served as router and gateway. We got two internet uplinks: a 1 Gbps sponsored leased line from Kerala Vision and a paid backup 100 Mbps connection from a different provider. IPv6 was added through HE's Tunnelbroker. Overall the network worked fine; as we additionally had hotel Wi-Fi, the conference network wasn't stretched much. I must highlight that DebConf is the only conference I know of where almost everything, including every piece of software, is developed in-house for the conference and modified according to need on the fly. Even event recording, camera work, audio check, direction and editing are all done on in-house software by volunteer-attendees (in some cases remote ones as well), all trained on the sidelines of the conference. The core recording and mixing equipment is owned by Debian and travels to each venue. The rest is sourced locally.
Gaming Rig which served as DC23 gateway router
It was fun seeing how almost everything was coordinated over text on Internet Relay Chat (IRC). If a talk or event was missing a talkmeister, a director or a camera person, a quick message on the #debconf channel would be enough for someone to volunteer. The video team had a dedicated support channel for each conference venue for any issues and was quick to respond and fix stuff.
Network information. Screengrab from closing ceremony
It rained for the first few days, which gave us cool weather. The swag team had decided to include umbrellas in the swag kit, which turned out to be quite useful. The swag kit was praised for its quality and selection; many thanks to Anupa, Sruthi and others. It was fun wearing different color T-shirts, all designed by Abraham: red for volunteers, light green for the video team, green for the core team (i.e. staff) and yellow for conference attendees.
With highvoltage
We were already acclimatized by the time DebConf really started, as we had been talking, hacking and hanging out for the previous 7 days. The rush really began with the start of DebConf. More people joined on the first and second day of the conference. As has been the tradition, an opening talk was prepared by Sruthi and the local team (which I highly recommend watching to get more insight into the process). DebConf day 1 also saw a job fair, where Canonical and FOSSEE, IIT Bombay had stalls for community interactions, which, judging by the crowd, turned out to be quite a hit. For me, the association with DebConf (and Debian) started with volunteering for the video team, so I was going to continue doing that at this conference as well. I usually volunteer for talks and events which I'm interested in anyway. Handling the camera, talkmeister-ing and direction are fun activities, though I didn't do sound this time around. Sound seemed difficult, and I didn't want to spoil someone's stream and recording. Talk attendance varied a lot: for the Bits from DPL talk the hall was full, but for some talks there were barely enough people to handle the volunteering tasks. That's what usually happens, though. DebConf is more of a place to come together and collaborate, so talk attendance is sometimes an afterthought.
Audience in highvoltage's Bits from DPL talk
I didn't submit any talk proposals this time around, as just being in the orga team was too much work already, and I knew the talk preparation would get delayed to the last moment and I would have to rush through it.
Enrico's talk
From Day 2 onward, more sponsor stalls were introduced in the hallway area: Hopbox by Unmukti, MostlyHarmless and Deeproot (joint stall) and FOSSEE. The MostlyHarmless stall had nice mechanical keyboards and other fun gadgets. Whenever I got the time, I would go and start a typing race to enjoy the nice, clicky keyboards. As the DebConf tradition dictates, we had a Cheese and Wine party. Everyone brought cheese and other delicacies from their region. Then there was yummy Sadya. Sadya is a traditional vegetarian Malayali lunch served on banana leaves. There were loads of different dishes served, the names of most of which I couldn't pronounce or recollect properly, but everything was super delicious. Day 4 was the day trip, and I chose to go to Athirappilly Waterfalls and the jungle safari. Pictures would describe the beauty better than words. The journey was a bit long though.
Athirappilly Falls

Pathu Pathu Tea Gardens
Tea Gardens
Late that day, we heard the news that Abraham had gone missing. We lost Abraham. He had worked really hard all through the years for Debian and for making this conference possible. Talks were cancelled for the next day and Jonathan addressed everyone. We went to Abraham's home the next day to meet his family. The team had arranged buses to Abraham's place. It was unfortunate that I only got an opportunity to visit his place after he was gone. Days went by slowly after that. The last day was marked by a small conference dinner. Some of the people had already left. All through that day and the next, we kept saying goodbye to friends with whom we had spent almost a fortnight.
Group photo with all DebConf T-shirts chronologically
This was my 2nd trip to Kochi. Vistara Airways' UK886 has become the default flight now. I have almost learned how to travel in and around Kochi by Metro, Water Metro, Airport Shuttle and auto. Things are quite accessible in Kochi, but the metro is a bit expensive compared to Delhi. I left Kochi on the 19th. My flight was due to leave around 8 PM, so I had the whole day to myself. A direct option would have taken less than 1 hour, but as I had time, I chose to take the long way to the airport. First I took an auto rickshaw to Kakkanad Water Metro station. Then I sailed in the water metro to Vyttila Water Metro station. Vyttila serves as a mobility hub which connects water metro, metro and bus at one place. I switched to the Metro here, from Vyttila Metro station to Aluva Metro station. There, I had lunch and then boarded the Airport feeder bus to reach Kochi Airport. All in all, I did auto rickshaw > water metro > metro > feeder bus to reach the airport. It was fun and scenic. I must say, public transport and intermodal integration is quite good, and one can transition seamlessly from one mode to the next.
Kochi Water Metro

Scenes from Kochi Water Metro
DebConf23 served its purpose of getting existing Debian people together, as well as getting new people interested in and contributing to Debian. People who came are still contributing to Debian, and that's amazing.
Streaming video stats. Screengrab from closing ceremony
The conference wasn't without its fair share of troubles. There were multiple money transfer woes, and being in India didn't help. Many thanks to the multiple organizations who were proactive in helping out. On top of this, there was conference visa uncertainty and other issues which troubled the visa team a lot. Kudos to everyone who made this possible. Surely, I'm going to miss some names, so thank you all; you know how much you have done to make this event possible. Now, DebConf24 is scheduled for Busan, South Korea, and work is already in full swing. As usual, I'm helping with the fundraising part and plan to attend too. Let's see if I can make it or not.
DebConf23 Group Photo
Credits - Aigars Mahinovs
In the end, we kept on saying that no DebConf at this scale would come back to India for the next 10 or 20 years. It's too much trouble, to be frank. It was probably a peak that we might not reach again. I would be happy to be proven wrong though :)
