Search Results: "lowe"

12 November 2024

Paul Tagliamonte: Complex for Whom?

In basically every engineering organization I've ever regarded as particularly high functioning, I've sat through one specific recurring conversation which is not a conversation about "complexity". Things are good or bad because they are or aren't complex, architectures need to be redone because they're too complex, some refactor of whatever it is won't work because it's too complex. You may have even been a part of some of these conversations, or even been the one advocating for simple light-weight solutions. I've done it. Many times.
Rarely, if ever, do we talk about complexity within its rightful context: complexity for whom. Is a solution complex because it's complex for the end user? Is it complex if it's complex for an API consumer? Is it complex if it's complex for the person maintaining the API service? Is it complex if it's complex for someone outside the team maintaining it to understand? Complexity within a problem domain, I've come to believe, is fairly zero-sum: there's a fixed amount of complexity in the problem to be solved, and you can choose to either solve it, or leave it for those downstream of you to solve that problem on their own.
That being said, while I believe there is a lower bound in complexity to contend with for a problem, I do not believe there is an upper bound to the complexity of solutions possible. It is always possible, and in fact very likely, that teams create problems for themselves while trying to solve a problem. The rest of this post is talking to the lower bound. When getting feedback on an early draft of this blog post, I was informed that Fred Brooks coined a term for what I call "lower bound complexity": "Essential Complexity", in the paper "No Silver Bullet: Essence and Accident in Software Engineering", which is a better term and can be used interchangeably.

Complexity Culture
In a large enough organization, where the team is high functioning enough to have and maintain trust amongst peers, members of the team will specialize. People will begin to engage with subsets of the work to be done, and begin to have their efficacy measured against that part of the organization's problems. Incentives shift, and over time it becomes increasingly likely that two engineers may have two very different priorities when working on the same system together. Someone accountable for uptime and tasked with responding to outages will begin to resist changes. Someone accountable for rapidly delivering features will resist gates between them and their users. Companies (either wittingly or unwittingly) will deal with this by tasking engineers with both production (feature development) and operational tasks (maintenance), so the difference in incentives isn't usually as bad as it could be.
When we get a bunch of folks from far-flung corners of an organization in a room, fire up a slide deck and throw up some aspirational to-be architecture diagram in order to get a sign-off to solve some problem (be it someone needs a credible promotion packet, a new feature needs to get delivered, or the system has begun to fail and needs fixing), the initial reaction will, more often than I'd like, start to devolve into a discussion of how this is going to introduce a bunch of complexity, going to be hard to maintain, "why can't you make it less complex?"
Right around here is when I start to try and contextualize the conversation happening around me: understand what complexity is being discussed, and understand who is taking on that burden. Think about who should be owning that problem, and work through the tradeoffs involved. Is it best solved here, or left to consumers (be they other systems, developers, or users)? Should something become an API call's optional param, taking on all the edge-cases and so on, or should users have to implement the logic using the data you return (leaving everyone else to take on all the edge-cases and maintenance)? Should you process the data, or require the user to preprocess it for you?
Frequently it's right to make an active and explicit decision to simplify and leave problems to be solved downstream, since they may not actually need to be solved, or perhaps you expect consumers will want to own the specifics of how the problem is solved, in which case you leave lots of documentation and examples. Many other times, especially when it's something downstream consumers are likely to hit, it's best solved internal to the system, since the only things that can come of leaving it unsolved are bugs, frustration and half-correct solutions. This is a grey-space of tradeoffs, not a clear decision tree. No one wants the software manifestation of a katamari ball or a junk drawer, nor does anyone want a half-baked service unable to handle the simplest use-case.

Head-in-sand as a Service
Popoffs about how complex something is are, to a first approximation, best understood as meaning "complicated for the person making comments". A lot of the #thoughtleadership believe that an AWS hosted EKS k8s cluster running images built by CI talking to an AWS hosted PostgreSQL RDS is not complex. They're right. Mostly right. This is less complex: less complex for them. It's not, however, without complexity and its own tradeoffs; it's just complexity that they do not have to deal with. Now they don't have to maintain machines that have pesky operating systems or hard drive failures. They don't have to deal with updating the version of k8s, nor ensuring the backups work. No one has to push some artifact to prod manually. Deployments happen unattended. You click a button and get a cluster.
On the other hand, developers outside the ops function need to deal with troubleshooting CI, debugging access control rules encoded in Turing complete YAML, permissions issues inside the cluster due to whatever the fuck a service mesh is, everyone needs to learn how to use some k8s tools they only actually use during a bad day, likely while doing some x.509 troubleshooting to connect to the cluster (an internal only endpoint; just port forward it), not to mention all sorts of rules to route packets to their project (a single repo's binary being run in 3 containers on a single vm host).
Beyond that, there's the invisible complexity: complexity on the interior of a service you depend on. I think about the dozens of teams maintaining the EKS service (which is either run on EC2 instances, or alternately, EC2 instances in a trench coat, moustache and even more shell scripts), the RDS service (also EC2 and shell scripts, but this time accounting for redundancy, backups, availability zones), scores of hypervisors pulled off the shelf (xen, kvm) smashed together with the ones built in-house (firecracker, nitro, etc) running on hardware that has to be refreshed and maintained continuously. Every request processed by network ACL rules, AWS IAM rules, security group rules, using IP space announced to the internet wired through IXPs directly into ISPs. I don't even want to begin to think about the complexity inherent in how those switches are designed. Shitloads of complexity to solve problems you may or may not have, or even know you had.
What's more complex? An app running in an in-house 4u server racked in the office's telco closet in the back running off the office Verizon line, or an app running four hypervisors deep in an AWS datacenter? Which is more complex to you? What about to your organization? In total? Which is more prone to failure? Which is more secure? Is the complexity good or bad? What type of complexity can you manage effectively? Which threatens the system? Which threatens your users?

COMPLEXIVIBES
This extends beyond Engineering. Decisions regarding what tools we are able to use (be they existing contracts with cloud providers, CIO mandated SaaS products, or a list of the only permissible open source projects) will incur costs in terms of expressed "complexity". Pinning open source projects to a fixed set makes SBOM production "less complex". Using only one SaaS provider's product suite (even if it's terrible, because it has all the types of tools you need) makes accreditation "less complex". If all you have is a contract with Pauly T's lowest price technically acceptable artisanal cloudary and haberdashery, the way you pay for your compute is "less complex" for the CIO shop, though you will find yourself building your own hosted database template, mechanism to spin up a k8s cluster, and all the operational and technical burden that comes with it. Or you won't, and make it everyone else's problem in the organization. Nothing you can do will solve for the fact that you must now deal with this problem somewhere, because it was less complicated for the business to put the workloads on the existing contract with a cut-rate vendor.
Suddenly, the decision to "reduce complexity" because of an existing contract vehicle has resulted in a huge amount of technical risk and maintenance burden being onboarded. Complexity you would otherwise externalize has now been taken on internally. With a large enough organization (specifically, in this case, I'm talking about you, bureaucracies), this is largely ignored or accepted as normal, since the personnel cost is understood to be free to everyone involved. Doing it this way is more expensive, more work, less reliable and less maintainable, and yet, somehow, is, in a lot of ways, "less complex" to the organization. It's particularly bad with bureaucracies, since screwing up a contract will get you into much more trouble than delivering a broken product, leaving basically no reason for anyone to care to fix this.
I can't shake the feeling that for every story of technical mandates gone awry, somewhere just out of sight there's a decisionmaker optimizing for what they believe to be the least amount of complexity (least hassle, fewest unique cases, most consistency) as they can. They freely offload complexity from their accreditation and risk acceptance functions through mandates. They will never have to deal with it. That does not change the fact that someone does.

TC;DR (TOO COMPLEX; DIDN'T REVIEW)
We wish to rid ourselves of systemic complexity; after all, complexity is bad, simplicity is good. Removing upper-bound own-goal complexity ("accidental complexity" in Brooks's terms) is important, but once you hit the lower bound complexity, the tradeoffs become zero-sum. Removing complexity from one part of the system means that somewhere else (maybe outside your organization or in a non-engineering function) must grow it back. Sometimes, the opposite is the case, such as when a previously manual business process is automated. Maybe that's a good idea. Maybe it's not. All I know is that what doesn't help the situation is conflating complexity with everything we don't like: legacy code, maintenance burden or toil, cost, delivery velocity.
  • Complexity is not the same as proclivity to failure. The most reliable systems I've interacted with are unimaginably complex, with layers of internal protection to prevent complete failure. This has its own set of costs which other people have written about extensively.
  • Complexity is not cost. Sometimes the cost of taking all the complexity in-house is less, for whatever value of cost you choose to use.
  • Complexity is not absolute. Something simple from one perspective may be wildly complex from another. The impulse to burn down complex sections of code is helpful to have generally, but sometimes things are complicated for a reason, even if that reason exists outside your codebase or organization.
  • Complexity is not something you can remove without introducing complexity elsewhere. Just as not making a decision is a decision itself; choosing to require someone else to deal with a problem rather than dealing with it internally is a choice that needs to be considered in its full context.
Next time you're sitting through a discussion and someone starts to talk about all the complexity about to be introduced, I want to pop up in the back of your head, politely asking what "complex" means in this context. Is it lower bound complexity? Is this complexity desirable? Does what they're saying mean something along the lines of "I don't understand the problems being solved", or does it mean something along the lines of "this problem should be solved elsewhere"? Do they believe this will result in more work for them in a way that you don't see? Should this not be solved at all, by changing the bounds of what we should accept or redefining the understood limits of this system? Is the perceived complexity a result of a decision elsewhere? Who's taking this complexity on, or more to the point, is failing to address complexity required by the problem leaving it to others? Does it impact others? How specifically? What are you not seeing? What can change? What should change?

11 November 2024

Gunnar Wolf: Why academics under-share research data - A social relational theory

This post is a review for Computing Reviews for "Why academics under-share research data - A social relational theory", an article published in Journal of the Association for Information Science and Technology.
As an academic, I have cheered for and welcomed the open access (OA) mandates that, slowly but steadily, have been accepted in one way or another throughout academia. It is now often accepted that public funds means public research. Many of our universities or funding bodies will demand that, with varying intensities: sometimes they demand research to be published in an OA venue, sometimes a mandate will only prefer it. Lately, some journals and funder bodies have expanded this mandate toward open science, requiring not only research outputs (that is, articles and books) to be published openly but for the data backing the results to be made public as well. As a person who has been involved with free software promotion since the mid 1990s, it was natural for me to join the OA movement and to celebrate when various universities adopt such mandates.
Now, what happens after a university or funder body adopts such a mandate? Many individual academics cheer, as it is the right thing to do. However, the authors observe that this is not really followed thoroughly by academics. What can be observed, rather, is the slow pace or foot-dragging of academics when they are compelled to comply with OA mandates, or even an outright refusal to do so. If OA and open science are close to the ethos of academia, why aren't more academics enthusiastically sharing the data used for their research? This paper finds a subversive practice embodied in the refusal to comply with such mandates, and explores a hypothesis based on Karl Marx's productive worker theory and Pierre Bourdieu's ideas of symbolic capital.
The paper explains that academics, as productive workers, become targets for exploitation: given that it's not only the academics' sharing ethos, but private industry's push for data collection and industry-aligned research, they adapt to technological changes and jump through all kinds of hurdles to create more products, in a result that can be understood as a neoliberal productivity measurement strategy. Neoliberalism assumes that mechanisms that produce more profit for academic institutions will result in better research; it also leads to the disempowerment of academics as a class, although they are rewarded as individuals due to the specific value they produce. The authors continue by explaining how open science mandates seem to ignore the historical ways of collaboration in different scientific fields, and exploring different angles of how and why data can be seen as "under-shared", failing to comply with different aspects of said mandates.
This paper, built on the social sciences tradition, is clearly a controversial work that can spark interesting discussions. While it does not specifically touch on computing, it is relevant to Computing Reviews readers due to the relatively high percentage of academics among us.

10 November 2024

Reproducible Builds: Reproducible Builds in October 2024

Welcome to the October 2024 report from the Reproducible Builds project. Our reports attempt to outline what we've been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.
Table of contents:
  1. Beyond bitwise equality for Reproducible Builds?
  2. Two Ways to Trustworthy at SeaGL 2024
  3. Number of cores affected Android compiler output
  4. On our mailing list
  5. diffoscope
  6. IzzyOnDroid passed 25% reproducible apps
  7. Distribution work
  8. Website updates
  9. Reproducibility testing framework
  10. Supply-chain security at Open Source Summit EU
  11. Upstream patches

Beyond bitwise equality for Reproducible Builds?
Jens Dietrich and Tim White of Victoria University of Wellington, New Zealand, along with Behnaz Hassanshahi and Paddy Krishnan of Oracle Labs Australia, published a paper entitled "Levels of Binary Equivalence for the Comparison of Binaries from Alternative Builds":
The availability of multiple binaries built from the same sources creates new challenges and opportunities, and raises questions such as: "Does build A confirm the integrity of build B?" or "Can build A reveal a compromised build B?". To answer such questions requires a notion of equivalence between binaries. We demonstrate that the obvious approach based on bitwise equality has significant shortcomings in practice, and that there is value in opting for alternative notions. We conceptualise this by introducing levels of equivalence, inspired by clone detection types.
A PDF of the paper is freely available.
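To make the "obvious approach" concrete, here is a minimal sketch of a bitwise comparison, assuming two build outputs are available as plain files. The helper name bitwise_equal and the stand-in file contents are invented for illustration and are not taken from the paper. It shows how a single embedded build date is enough to make two otherwise equivalent outputs compare as different:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if both files are byte-for-byte identical.
bitwise_equal() {
    cmp -s "$1" "$2"
}

workdir=$(mktemp -d)

# Stand-in "build outputs": A and B differ only in an embedded build date.
printf 'payload built 2024-10-01\n' > "$workdir/build_a"
printf 'payload built 2024-10-02\n' > "$workdir/build_b"
# C is an exact copy of A.
cp "$workdir/build_a" "$workdir/build_c"

if bitwise_equal "$workdir/build_a" "$workdir/build_c"; then
    same_ac=yes
else
    same_ac=no
fi

if bitwise_equal "$workdir/build_a" "$workdir/build_b"; then
    same_ab=yes
else
    same_ab=no
fi

echo "A vs C bitwise equal: $same_ac"
echo "A vs B bitwise equal: $same_ab"

rm -rf "$workdir"
```

A check like this can only answer "identical or not"; it says nothing about whether a difference is meaningful, which is the gap the paper's levels of equivalence are meant to address.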

Two Ways to Trustworthy at SeaGL 2024
On Friday 8th November, Vagrant Cascadian will present a talk entitled "Two Ways to Trustworthy" at SeaGL in Seattle, WA. Founded in 2013, SeaGL is a free, grassroots technical summit dedicated to spreading awareness and knowledge about free/libre/open source software, hardware and culture. Vagrant's talk:
[…] delves into how two project[s] approaches fundamental security features through Reproducible Builds, Bootstrappable Builds, code auditability, etc. to improve trustworthiness, allowing independent verification; trustworthy projects require little to no trust. Exploring the challenges that each project faces due to very different technical architectures, but also contextually relevant social structure, adoption patterns, and organizational history should provide a good backdrop to understand how different approaches to security might evolve, with real-world merits and downsides.

Number of cores affected Android compiler output
Fay Stegerman wrote that the cause of the Android toolchain bug from September's report, which she reported to the Android issue tracker, has been found and the bug has been fixed.
the D8 Java to DEX compiler (part of the Android toolchain) eliminated a redundant field load if running the class's static initialiser was known to be free of side effects, which ended up accidentally depending on the sharding of the input, which is dependent on the number of CPU cores used during the build.
To make it easier to understand the bug and the patch, Fay also made a small example to illustrate when and why the optimisation involved is valid.

On our mailing list
On our mailing list this month:

diffoscope
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 279, 280, 281 and 282 to Debian:
  • Ignore errors when listing .ar archives (#1085257).
  • Don't try and test with systemd-ukify in the Debian stable distribution.
  • Drop Depends on the deprecated python3-pkg-resources (#1083362).
In addition, Jelle van der Waa added support for Unified Kernel Image (UKI) files. Furthermore, Vagrant Cascadian updated diffoscope in GNU Guix to version 282.

IzzyOnDroid passed 25% reproducible apps
The IzzyOnDroid project has reached a good milestone: over 25% of the ~1,200 Android apps provided by its repository (of official APKs built by the original application developers) have been confirmed to be reproducible by a rebuilder.

Distribution work
In Debian this month:
  • Holger Levsen uploaded devscripts version 2.24.2, including many changes to the debootsnap, debrebuild and reproducible-check scripts. This is the first time that debrebuild actually works (using sbuild's unshare backend). As part of this, Holger also fixed an issue in the reproducible-check script where a typo in the code led to incorrect results.
  • Recently, a news entry was added to snapshot.debian.org's homepage, describing the recent changes that made the system stable again:
    The new server has no problems keeping up with importing the full archives on every update, as each run finishes comfortably in time before it's time to run again. [While] the new server is the one doing all the importing of updated archives, the HTTP interface is being served by both the new server and one of the VM's at LeaseWeb.
    The entry lists a number of specific updates surrounding the API endpoints and rate limiting.
  • Lastly, 12 reviews of Debian packages were added, 3 were updated and 18 were removed this month, adding to our knowledge about identified issues.
Elsewhere in distribution news, Zbigniew Jędrzejewski-Szmek performed another rebuild of Fedora 42 packages, with the headline result being that 91% of the packages are reproducible. Zbigniew also reported a reproducibility problem with QImage. Finally, in openSUSE, Bernhard M. Wiedemann published another report for that distribution.

Website updates
There were an enormous number of improvements made to our website this month, including:
  • Alba Herrerias:
    • Improve consistency across distribution-specific guides.
    • Fix a number of links on the Contribute page.
  • Chris Lamb:
  • hulkoba
  • James Addison:
    • Huge and significant work on an (as-yet-unmerged) quickstart guide to be linked from the homepage.
    • On the homepage, link directly to the Projects subpage.
    • Relocate dependency-drift notes to the Volatile inputs page.
  • Ninette Adhikari:
    • Add a brand new Success stories page that "highlights the success stories of Reproducible Builds, showcasing real-world examples of projects shipping with verifiable, reproducible builds".
  • Pol Dellaiera:
    • Update the website's README page for building the website under NixOS.
    • Add a new academic paper citation.
Lastly, Holger Levsen filed an extensive issue detailing a request to create an overview of recommendations and standards in relation to reproducible builds.

Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In October, a number of changes were made by Holger Levsen, including:
  • Add a basic index.html for rebuilderd.
  • Update the nginx.conf configuration file for rebuilderd.
  • Document how to use a rescue system for Infomaniak's OpenStack cloud.
  • Update usage info for two particular nodes.
  • Fix up a version skew check to fix the name of the riscv64 architecture.
  • Update the rebuilderd-related TODO.
In addition, Mattia Rizzolo added a new IP address for the inos5 node and Vagrant Cascadian brought 4 virt nodes back online.

Supply-chain security at Open Source Summit EU
The Open Source Summit EU took place recently, and covered plenty of topics related to supply-chain security, including:

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 November 2024

Thomas Lange: Using NIS (Network Information Service) in 2024

The topic of this posting already tells you that an old Unix guy tells stories about old techniques. I've been a happy NIS (formerly YP) user for 30+ years. I started using it with SunOS 4.0, later using it with Solaris and with Linux since 1999. In the past, a colleague was not happy using NIS+ when he couldn't log in as root after a short time because of some well known bugs and wrong configs. NIS+ was also much slower than my NIS setup. I know organisations using NIS for more than 80,000 user accounts in 2024.
I know the security implications of NIS but I can live with them, because I manage all computers in the network that have access to the NIS maps. And NIS on Linux offers to use shadow maps, which are only accessible to the root account. My users are forced to use very long passwords.
Unfortunately NIS support for the PAM modules was removed in Debian in pam 1.4.0-13, which means Debian 12 (bookworm) is lacking NIS support in PAM, but otherwise it is still supported. This only affects changing the NIS password via passwd. You can still authenticate users and use other NIS maps.
But yppasswd is deprecated and you should not use it! If you use yppasswd it may generate a new password hash by using the old DES crypt algorithm, which is very weak and only uses the first 8 chars in your password. Do not use yppasswd any more! yppasswd only detects DES, MD5, SHA256 and SHA512 hashes, but for me and some colleagues it only creates weak DES hashes after a password change. yescrypt hashes, which are the default in Debian 12, are not supported at all.
The solution is to use the plain passwd program. On the NIS master, you should set up your NIS configuration to use /etc/shadow and /etc/passwd even if your other NIS maps are in /var/yp/src or similar. Make sure to have these lines in your /var/yp/Makefile:
PASSWD      = /etc/passwd
SHADOW      = /etc/shadow
Call make once, and it will generate the shadow and passwd map. You may want to set the variable MINUID which defines which entries are not put into the NIS maps. On all NIS clients you still need the entries in /etc/nsswitch.conf (for passwd, shadow, group,...) that point to the nis service. E.g.:
passwd:         files nis systemd
group:          files nis systemd
shadow:         files nis
You can remove all occurrences of "nis" in your /etc/pam.d/common-password file. Then you can use the plain passwd program to change your password on the NIS master. But this does not call make in /var/yp for updating the NIS shadow map. Let's use inotify(7) for that. First, create a small shell script /usr/local/sbin/shadow-change:
#! /bin/sh
PATH=/usr/sbin:/usr/bin
# only watch the /etc/shadow file ($2 is the file name passed by incron)
if [ "$2" != "shadow" ]; then
  exit 0
fi
# rebuild the NIS maps; give up if /var/yp is missing
cd /var/yp || exit 3
sleep 2
make
Then install the package incron.
# apt install incron
# echo root >> /etc/incron.allow
# incrontab -e
Add this line:
/etc    IN_MOVED_TO     /usr/local/sbin/shadow-change $@ $# $%
It's not possible to use IN_MODIFY or watch other events on /etc/shadow directly, because the passwd command creates a /etc/nshadow file, deletes /etc/shadow and then moves nshadow to shadow. inotify on a file does not work after the file was removed. You can see the logs from incrond by using:
# journalctl _COMM=incrond
e.g.
Oct 01 12:21:56 kueppers incrond[6588]: starting service (version 0.5.12, built on Jan 27 2023 23:08:49)
Oct 01 13:43:55 kueppers incrond[6589]: table for user root created, loading
Oct 01 13:45:42 kueppers incrond[6589]: PATH (/etc) FILE (shadow) EVENT (IN_MOVED_TO)
Oct 01 13:45:42 kueppers incrond[6589]: (root) CMD ( /usr/local/sbin/shadow-change /etc shadow IN_MOVED_TO)
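The reason for watching the directory instead of the file can be reproduced without touching /etc. The following is only a sketch in a scratch directory: the file names mirror the ones mentioned above, while the helper inode_of and the file contents are made up for illustration. It replays the create, delete, rename sequence and shows that the resulting shadow file is a different inode from the one a file-level watch would have been attached to:

```shell
#!/bin/sh
# Work in a throwaway directory; nothing under /etc is touched.
scratch=$(mktemp -d)

# Hypothetical helper: print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# The "old" shadow file, as it exists before the password change.
echo 'old contents' > "$scratch/shadow"
inode_before=$(inode_of "$scratch/shadow")

# Replay the sequence described above: write nshadow, delete shadow,
# then move nshadow into place.
echo 'new contents' > "$scratch/nshadow"
rm "$scratch/shadow"
mv "$scratch/nshadow" "$scratch/shadow"

inode_after=$(inode_of "$scratch/shadow")
new_contents=$(cat "$scratch/shadow")

echo "inode before: $inode_before"
echo "inode after:  $inode_after"

rm -rf "$scratch"
```

Because the name shadow ends up pointing at a new inode, only an event on the containing directory (here IN_MOVED_TO on /etc) reliably reports the change.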
I've disabled the execution of yppasswd using dpkg-divert:
# dpkg-divert --local --rename --divert /usr/bin/yppasswd-disable /usr/bin/yppasswd
# chmod a-rwx /usr/bin/yppasswd-disable
Do not forget to limit the access to the shadow.byname map in ypserv.conf and general access to NIS in ypserv.securenets. I've also discovered the package pamtester, which is a nice package for testing your pam configs.
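To find accounts that still carry one of the weak DES hashes mentioned above, the hash type can be guessed from the prefix of the second field of a shadow entry. This is a rough sketch, not part of the setup described here: the function names classify_hash and report_des_users are made up, the sample entries are fabricated, and the prefix table only covers the schemes named in this post ($1$ for MD5, $5$ for SHA256, $6$ for SHA512, $y$ for yescrypt, and a bare 13 character string for traditional DES).

```shell
#!/bin/sh
# Hypothetical helper: guess the hashing scheme from a shadow hash field.
classify_hash() {
    case "$1" in
        '$1$'*) echo md5 ;;
        '$5$'*) echo sha256 ;;
        '$6$'*) echo sha512 ;;
        '$y$'*) echo yescrypt ;;
        '!'*|'*'*|'') echo locked ;;
        '$'*) echo other ;;
        *)
            # Traditional DES crypt hashes are 13 characters, no '$' prefix.
            if [ "${#1}" -eq 13 ]; then
                echo des
            else
                echo unknown
            fi
            ;;
    esac
}

# Hypothetical helper: read shadow-format lines on stdin and print the
# names of users whose hash looks like traditional DES.
report_des_users() {
    while IFS=: read -r user hash rest; do
        if [ "$(classify_hash "$hash")" = des ]; then
            echo "$user"
        fi
    done
}

# Fabricated sample entries, for demonstration only.
weak_users=$(report_des_users <<'EOF'
alice:$y$j9T$examplesalt$examplehash:19999:0:99999:7:::
bob:abJnggxhB/yWI:19999:0:99999:7:::
carol:$6$examplesalt$examplehash:19999:0:99999:7:::
EOF
)

echo "accounts with DES hashes: $weak_users"
```

On a real NIS master the same function could be fed from /etc/shadow as root (report_des_users < /etc/shadow), after which the listed users would need to set a new password with the plain passwd program.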

7 November 2024

Jonathan Dowland: John Carpenter's "The Fog"

'The Fog' 7 inch vinyl record
A gift from my brother. Coincidentally I've had John Carpenter's Halloween echoing around my head for weeks: I've been deconstructing it and trying to learn to play it.

6 November 2024

Bits from Debian: Bits from the DPL

Dear Debian community,
this is Bits from DPL for October. In addition to a summary of my recent activities, I aim to include newsworthy developments within Debian that might be of interest to the broader community. I believe this provides valuable insights and fosters a sense of connection across our diverse projects. Also, I welcome your feedback on the format and focus of these Bits, as community input helps shape their value.
Ada Lovelace Day 2024
As outlined in my platform, I'm committed to increasing the diversity of Debian developers. I hope the recent article celebrating Ada Lovelace Day 2024, featuring interviews with women in Debian, will serve as an inspiring motivation for more women to join our community.
MiniDebConf Cambridge
This was my first time attending the MiniDebConf in Cambridge, hosted at the ARM building. I thoroughly enjoyed the welcoming atmosphere of both MiniDebCamp and MiniDebConf. It was wonderful to reconnect with people who hadn't made it to the last two DebConfs, and, as always, there was plenty of hacking, insightful discussions, and valuable learning. If you missed the recent MiniDebConf, there's a great opportunity to attend the next one in Toulouse. It was recently decided to include a MiniDebCamp beforehand as well.
FTPmaster accepts MRs for DAK
At the recent MiniDebConf in Cambridge, I discussed potential enhancements for DAK to make life easier for both FTP Team members and developers. For those interested, the document "Hacking on DAK" provides guidance on setting up a local DAK instance and developing patches, which can be submitted as MRs. As a perfectly random example of such improvements, an older MR, "Add commands to accept/reject updates from a policy queue", might give you some inspiration.
At MiniDebConf, we compiled an initial list of features that could benefit both the FTP Team and the developer community. While I had preliminary discussions with the FTP Team about these items, not all ideas had consensus. I aim to open a detailed, public discussion to gather broader feedback and reach a consensus on which features to prioritize.
Sometimes, packages are rejected not because of DFSG-incompatible licenses but due to other issues that could be resolved within an existing package (as discussed in my DebConf23 BoF, "Chatting with ftpmasters"[1]). During the "Meet the ftpteam" BoF (Log/transcription of the BoF can be found here), for the moment until the MR gets accepted, a new option was proposed for FTP Team members reviewing packages in NEW:

Accept + Bug Report
This option would allow a package to enter Debian (in unstable or experimental) with an automatically filed RC bug report. The RC bug would prevent the package from migrating to testing until the issues are addressed. To ensure compatibility with the BTS, which only accepts bug reports for existing packages, a delayed job (24 hours post-acceptance) would file the bug.

When binary package names change, currently the package must go through the NEW queue, which can delay the availability of updated libraries. Allowing such packages to bypass the queue could expedite this process. A configuration option to enable this bypass specifically for uploads to experimental may be useful, as it avoids requiring additional technical review for experimental uploads. Previously, I believed the requirement for binary name changes to pass through NEW was due to a missing feature in DAK, possibly addressable via an MR. However, in discussions with the FTP Team, I learned this is a matter of team policy rather than technical limitation. I haven't found this policy documented, so it may be worth having a community discussion to clarify and reach consensus on how we want to handle binary name changes, so that the MR can be sensibly designed.
When a developer requests the removal of a package, whether entirely or for specific architectures, RM bugs must be filed for the package itself as well as for each package depending on it. It would be beneficial if the dependency tree could be automatically resolved, allowing either:
a) the DAK removal tooling to remove the entire dependency tree
   after prompting the bug report author for confirmation, or
b) the system to auto-generate corresponding bug reports for all
   packages in the dependency tree.
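Both options rest on the same step: finding every package that depends, directly or transitively, on the package to be removed. A minimal sketch of that resolution follows, assuming a simplified model in which each package maps to a flat set of dependencies. The function name and the package names are hypothetical, and real Debian dependencies (alternatives, virtual packages, build dependencies, per-architecture differences) are not modelled.

```python
from collections import deque


def reverse_dependency_closure(depends, target):
    """Return, sorted, every package that transitively depends on `target`.

    `depends` maps a package name to the set of packages it depends on.
    """
    # Invert the graph: for each package, who depends on it?
    rdepends = {}
    for pkg, deps in depends.items():
        for dep in deps:
            rdepends.setdefault(dep, set()).add(pkg)

    # Breadth-first walk over the reverse edges, starting at the target.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dependent in rdepends.get(current, ()):
            if dependent != target and dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


if __name__ == "__main__":
    # Toy data, not taken from the archive.
    toy = {
        "libfoo1": set(),
        "foo-tools": {"libfoo1"},
        "foo-gui": {"foo-tools"},
        "bar": {"libbar2"},
    }
    for pkg in reverse_dependency_closure(toy, "libfoo1"):
        print(f"would need an RM bug (or confirmation) for: {pkg}")
```

The resulting list could either be presented to the bug report author for confirmation or used to generate one bug report per package.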
Option (b) might be better suited for implementation in an MR for reportbug. However, given the possibility of large-scale removals (for example, targeting specific architectures), having appropriate tooling for this would be very beneficial.

In my opinion the proposed DAK enhancements aim to support both FTP Team members and uploading developers. I'd be very pleased if these ideas spark constructive discussion and inspire volunteers to start working on them, possibly even preparing to join the FTP Team.

On the topic of ftpmasters: an ongoing discussion with SPI lawyers is currently reviewing the non-US agreement established 22 years ago. Ideally, this review will lead to a streamlined workflow for ftpmasters, removing certain hurdles that were originally put in place due to legal requirements, which were updated in 2021.

Contacting teams
My outreach efforts to Debian teams have slowed somewhat recently. However, I want to emphasize that anyone from a packaging team is more than welcome to reach out to me directly. My outreach emails aren't following any specific order, just my own somewhat naïve view of Debian, which I'm eager to make more informed. Recently, I received two very informative responses: one from the Qt/KDE Team, which thoughtfully compiled input from several team members into a shared document. The other was from the Rust Team, where I received three quick, helpful replies, one of which included an invitation to their upcoming team meeting.

Interesting readings on our mailing lists
I consider the following threads on our mailing lists interesting reading and would like to add some comments.

Sensible languages for younger contributors
Though the discussion on debian-devel about programming languages took place in September, I only recently caught up with it. I strongly believe Debian must continue evolving to stay relevant for the future.
"Everything must change, so that everything can stay the same."
-- Giuseppe Tomasi di Lampedusa, The Leopard
I encourage constructive discussions on integrating programming languages in our toolchain that support this evolution.

Concerns regarding the "Open Source AI Definition"
A recent thread on the debian-project list discussed the "Open Source AI Definition". This topic will impact Debian in the future, and we need to reach an informed decision. I'd be glad to see more perspectives in the discussions, particularly on finding a sensible consensus, understanding how FTP Team members view their delegated role, and considering whether their delegation might need adjustments for clarity on this issue.

Kind regards
Andreas.

4 November 2024

Ravi Dwivedi: Asante Kenya for a Good Time

In September of this year, I visited Kenya to attend the State of the Map conference. I spent six nights in the capital Nairobi, two nights in Mombasa, and one night on a train. I was very happy with the visa process being smooth and quick. Furthermore, I stayed at the Nairobi Transit Hotel with other attendees, with Ibtehal from Bangladesh as my roommate. One of the memorable moments was the time I spent at a local coffee shop nearby. We used to go there at midnight, despite the grating in the shops suggesting such adventures were unsafe. Fortunately, nothing bad happened, and we were rewarded with a fun time with the locals.
The coffee shop Ibtehal and I used to visit at midnight
Grating at a chemist shop in Mombasa, Kenya
The country lies on the equator, which might give the impression of extremely hot temperatures. However, Nairobi was on the cooler side (10 to 25 degrees Celsius), and I found myself needing a hoodie, which I bought the next day. It also served as a nice souvenir, as it had an outline of the African map printed on it. I also bought a Safaricom SIM card for 100 shillings and recharged it with 1000 shillings for 8 GB internet with 5G speeds and 400 minutes talk time.

A visit to Nairobi's Historic Cricket Ground
On this trip, I got a unique souvenir that can't be purchased from the market: a cricket jersey worn in an ODI match by a player. The story goes as follows: I was roaming around the market with my friend Benson from Nairobi to buy a Kenyan cricket jersey for myself, but we couldn't find any. So, Benson had the idea of visiting the Nairobi Gymkhana Club, which used to be Kenya's main cricket ground. It has hosted some historic matches, including the 2003 World Cup match in which Kenya beat the mighty Sri Lankans and the 1996 match in which Shahid Afridi set the record for the fastest ODI century, in just 37 balls. Although entry to the club was exclusively for members, I was warmly welcomed by the staff. Upon reaching the cricket ground, I met some Indian players who played in Kenyan leagues, as well as Lucas Oluoch and Dominic Wesonga, who have represented Kenya in ODIs. When I expressed interest in getting a jersey, Dominic agreed to send me pictures of his jersey. I liked his jersey and collected it from him. I gave him 2000 shillings, an amount suggested by those Indian players.
Me with players at the Nairobi Gymkhana Club
Cricket pitch at the Nairobi Gymkhana Club
A view of the cricket ground inside the Nairobi Gymkhana Club
Scoreboard at the Nairobi Gymkhana cricket ground

Giraffe Center in Nairobi
Kenya is known for its safaris and has no shortage of national parks. In fact, Nairobi is the only capital in the world with a national park. I decided not to visit one, as most of them were expensive and offered multi-day tours, and I didn't want to spend that much time in the wild. Instead, I went to the Giraffe Center in Nairobi with Pragya and Rabina. The ticket cost 1500 Kenyan shillings (1000 Indian rupees). In Kenya, matatus (shared vans, usually decorated with portraits of famous people and playing rap songs) are the most popular means of public transport. Reaching the Giraffe Center from our hotel required taking five matatus, which cost a total of 150 shillings, and a 2 km walk. The journey back was 90 shillings, suggesting that we didn't find the most efficient route to get there. At the Giraffe Center, we fed giraffes and took photos.
A matatu with a Notorious BIG portrait.
Inside the Giraffe Center

Train ride from Nairobi to Mombasa
I took a train from Nairobi to Mombasa. The train is known as the SGR Train, where SGR refers to Standard Gauge Railway. The journey was around 500 km. M-Pesa was the only way to make payment for pre-booking the train ticket, and I didn't have an M-Pesa account. Pragya's friend Mary helped facilitate the payment. I booked a second-class ticket, which cost 1500 shillings (1000 Indian rupees). The train was scheduled to depart from Nairobi at 08:00 and arrive in Mombasa at 14:00. The security check at the station required scanning our bags and having them sniffed by sniffer dogs. I also ran into an attempted scam: a security official offered to help me get my ticket printed, only to ask me afterwards to get him some coffee, which I politely declined. Before boarding the train, I was treated to some stunning views at the Nairobi Terminus station. It was a seating train, but I wished it were a sleeper train, as I was sleep-deprived. The train was neat and clean, with good toilets. The train reached Mombasa on time at around 14:00.
SGR train at Nairobi Terminus.
Interior of the SGR train

Arrival in Mombasa
Mombasa Terminus station.
Mombasa was a bit hotter than Nairobi, with temperatures reaching around 30 degrees Celsius. However, that's not too hot for me, as I am used to higher temperatures in India. I had booked a hostel in the Old Town and was trying to hitchhike from the Mombasa Terminus station. After trying for more than half an hour, I took a matatu that dropped me 3 km from my hostel for 200 shillings (140 Indian rupees). I tried to hitchhike again but couldn't find a ride. I think I know why I couldn't get a ride in either case. In the first case, the Mombasa Terminus was in an isolated place, so most of the vehicles were taxis or matatus, while any noncommercial cars were there to pick up friends and family. If the station were in the middle of the city, there would be many more car/truck drivers passing by, thus increasing my chances of getting a ride. In the second case, my hostel was at the end of the city, and nobody was going towards that side. In fact, many drivers told me they would love to give me a ride, but they were going in some other direction. Finally, I took a tuktuk for 70 shillings to reach my hostel, Tulia Backpackers. It was 11 USD (1400 shillings) for one night. The balcony gave a nice view of the Indian Ocean. The rooms had fans, but there was no air conditioning. Each bed also had a mosquito net. The place was within walking distance of the famous Fort Jesus. Mombasa has had more Islamic influence compared to Nairobi and also has many Hindu temples.
The balcony at Tulia Backpackers Hostel had a nice view of the ocean.
A room inside the hostel with fans and mosquito nets on the beds

Visiting White Sandy Beaches and Getting a Hitchhike
Visiting Nyali beach marked my first time ever at a white sand beach. It was about 10 km from the hostel. The next day, I visited Diani Beach, which was 30 km from the hostel. Going to Diani Beach required crossing a river, for which there's a free ferry service every few minutes, followed by taking a matatu to Ukunda and then a tuk-tuk. The journey gave me a glimpse of the beautiful countryside of Kenya.
Nyali beach is a white sand beach
This is the ferry service for crossing the river.
During my return from Diani Beach to the hostel, I was successful in hitchhiking. However, it was only a 4 km ride and not sufficient to reach Ukunda, so I tried to get another ride. When a truck stopped for me, I asked for a ride to Ukunda. Later, I learned that they were going in the same direction as me, so I got off within walking distance from my hostel. The ride was around 30 km. I also learned the difference between a truck ride and a matatu or car ride. For instance, matatus and cars are much faster and cooler due to air conditioning, while trucks tend to be warmer because they lack it. Further, the truck was stopped at many checkpoints by the police for inspections as it carried goods, which is not the case with matatus. Anyways, it was a nice experience, and I am grateful for the ride. I had a nice conversation with the truck drivers about Indian movies and my experiences in Kenya.
Diani beach is a popular beach in Kenya. It is a white sand beach.
Selfie with truck drivers who gave me the free ride

Back to Nairobi
I took the SGR train from Mombasa back to Nairobi. This time I took the night train, which departs at 22:00 and reaches Nairobi at around 04:00 in the morning. I could not sleep comfortably since the train only had seats, not berths. I had booked the Zarita Hotel in Nairobi and had already confirmed that they allowed early morning check-in. Usually, hotels have a fixed checkout time, say 11:00 in the morning, and you are not allowed to stay beyond that regardless of the time you checked in. But this hotel checked me in for 24 hours. Here, I paid in US dollars, and the cost was 12 USD.

Almost Got Stuck in Kenya
Two days before my scheduled flight from Nairobi back to India, I heard the news that the airports in Kenya were closed due to strikes. Rabina and Pragya had their flight back to Nepal canceled that day, which left them stuck in Nairobi for two additional days. I called Sahil in India and found out during the conversation that the strike was called off in the evening. It was a big relief for me, and I was fortunate to be able to fly back to India without any changes to my plans.
Newspapers at a stand in Kenya covering news on the airport closure

Experience with locals
I had no problems communicating with Kenyans, as everyone I met knew English to an extent that could easily surpass that of big cities in India. Additionally, I learned a few words from Kenya's most popular local language, Swahili, such as Asante, meaning thank you, Jambo for hello, and Karibu for welcome. Knowing a few words in the local language went a long way. I am not sure what's up with haggling in Kenya. It wasn't easy to bring the price of souvenirs down. I bought a fridge magnet for 200 shillings, which was the quoted price. On the other hand, it was much easier to bargain with taxis/tuktuks/motorbikes. I stayed at three hotels/hostels in Kenya. None of them had air conditioners. Two of the places were in Nairobi, and they didn't even have fans in the rooms, while the one in Mombasa had only fans. All of them had good Wi-Fi, except Tulia, where the internet overall was a bit shaky. My experience with the hotel staff was great. For instance, we requested that the Nairobi Transit Hotel cancel the included breakfast in order to reduce the room costs, but later realized that it was not a good idea. The hotel allowed us to revert and even offered one of our missing breakfasts during dinner. The staff at Tulia Backpackers in Mombasa facilitated the ticket payment for my train from Mombasa to Nairobi. One of the staff members also gave me a lift to the place where I could catch a matatu to Nyali Beach. They even added an extra tea bag to my tea when I requested it to be stronger.

Food
At the Nairobi Transit Hotel, a Spanish omelette with tea was served for breakfast. I noticed that Spanish omelette appeared on the menus of many restaurants, suggesting that it is popular in Kenya. This was my first time having this dish. The milk tea in Kenya, referred to by locals as white tea, is lighter than Indian tea (they don't put in a lot of tea leaves).
Spanish omelette served for breakfast at the Nairobi Transit Hotel
I also sampled ugali with eggs. In Mombasa, I visited an Indian restaurant called New Chetna and had a buffet thali there twice.
Ugali with eggs.

Tips for Exchanging Money
In Kenya, I exchanged my money at forex shops a couple of times. I received good exchange rates for bills of 50 USD or larger. For instance, 1 USD on xe.com was 129 shillings, and I got 128.3 shillings per USD (a total of 12,830 shillings) for two 50 USD notes at an exchange in Nairobi, while 127 shillings was the highest rate at the banks. On the other hand, for smaller bills such as a one US dollar note, I would have got 125 shillings. A passport was the only document required for the exchange, and they also provided a receipt. A good piece of advice for travelers is to keep 50 USD or larger bills for exchanging into the local currency while saving the smaller US dollar bills for accommodation, as many hotels and hostels accept payment in US dollars (in addition to Kenyan shillings).

Missed Malindi and Lamu
There were more places on my to-visit list in Kenya. But I simply didn't have time to cover them, as I don't like rushing through places, especially in a foreign country where there is a chance of me underestimating the amount of time transit takes. I would have liked to visit at least one of Kilifi, Watamu or Malindi beaches. Further, Lamu seemed like a unique place to visit as it has no cars or motorized transport; the only options for transport are boats and donkeys. That's it for now. Meet you in the next one :)

Sven Hoexter: Google CloudDNS HTTPS Records with ipv6hint

I naively provisioned an HTTPS record at Google CloudDNS like this via terraform:
resource "google_dns_record_set" "testv6" {
    name         = "testv6.some-domain.example."
    managed_zone = "some-domain-example"
    type         = "HTTPS"
    ttl          = 3600
    rrdatas      = ["1 . alpn=\"h2\" ipv4hint=\"198.51.100.1\" ipv6hint=\"2001:DB8::1\""]
}
This results in a permanent diff because the Google CloudDNS API seems to parse the record content and store the ipv6hint expanded (removing the :: notation) and in all lowercase, as 2001:db8:0:0:0:0:0:1. Thus, to fix the permanent diff, we have to write it like this:
resource "google_dns_record_set" "testv6" {
    name         = "testv6.some-domain.example."
    managed_zone = "some-domain-example"
    type         = "HTTPS"
    ttl          = 3600
    # ipv6hint written the way the API stores it: lowercase, no "::" shorthand
    rrdatas      = ["1 . alpn=\"h2\" ipv4hint=\"198.51.100.1\" ipv6hint=\"2001:db8:0:0:0:0:0:1\""]
}
Guess I should be glad that they already support HTTPS records natively, and not bicker too much about the implementation details.
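To avoid expanding addresses by hand, the stored form can be computed ahead of time. The sketch below assumes, based only on the single example above, that CloudDNS writes all eight groups in lowercase hex without leading zeros and without the :: shorthand. The helper name `expand_ipv6_hint` is made up for illustration. Note that Python's own `IPv6Address.exploded` pads every group to four digits, which would not match, hence the manual formatting.

```python
import ipaddress


def expand_ipv6_hint(addr: str) -> str:
    """Format an IPv6 address as eight lowercase hex groups, no '::', no padding."""
    value = int(ipaddress.IPv6Address(addr))
    # Split the 128-bit integer into eight 16-bit groups, most significant first.
    groups = [(value >> shift) & 0xFFFF for shift in range(112, -1, -16)]
    return ":".join(format(group, "x") for group in groups)


if __name__ == "__main__":
    print(expand_ipv6_hint("2001:DB8::1"))
```

The output can then be pasted into the rrdatas string in place of the shorthand address.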

1 November 2024

Colin Watson: Free software activity in October 2024

Almost all of my Debian contributions this month were sponsored by Freexian. You can also support my work directly via Liberapay.

Ansible
I noticed that Ansible had fallen out of Debian testing due to autopkgtest failures. This seemed like a problem worth fixing: in common with many other people, we use Ansible for configuration management at Freexian, and it probably wouldn't make our sysadmins too happy if they upgraded to trixie after its release and found that Ansible was gone. The problems here were really just slogging through test failures in both the ansible-core and ansible packages, but their test suites are large and take a while to run so this took some time. I was able to contribute a few small fixes to various upstreams in the process. This should now get back into testing tomorrow.

OpenSSH
Martin-Éric Racine reported that ssh-audit didn't list the ext-info-s feature as being available in Debian's OpenSSH 9.2 packaging in bookworm, contrary to what OpenSSH upstream said on their specifications page at the time. I spent some time looking into this and realized that upstream was mistakenly saying that implementations of ext-info-c and ext-info-s were added at the same time, while in fact ext-info-s was added rather later. ssh-audit now has clearer output, and the OpenSSH maintainers have corrected their specifications page.
I looked into a report of an ssh failure in certain cases when using GSS-API key exchange (which is a Debian patch). Once again, having integration tests was a huge win here: the affected scenario is quite a fiddly one, but I was able to set it up in the test, and thereby make sure it doesn't regress in future. It still took me a couple of hours to get all the details right, but in the past this sort of thing took me much longer with a much lower degree of confidence that the fix was correct.
On upstream's advice, I cherry-picked some key exchange fixes needed for big-endian architectures.

Python team
  • I packaged python-evalidate, needed for a new upstream version of buildbot.
  • The Python 3.13 transition rolls on. I fixed problems related to it in htmlmin, humanfriendly, postgresfixture (contributed upstream), pylint, python-asyncssh (contributed upstream), python-oauthlib, python3-simpletal, quodlibet, zope.exceptions, and zope.interface.
  • A trickier Python 3.13 issue involved the cgi module. Years ago I ported zope.publisher to the multipart module because cgi.FieldStorage was broken in some situations, and as a result I got a recommendation into Python's "dead batteries" PEP 594. Unfortunately there turns out to be a name conflict between multipart and python-multipart on PyPI; python-multipart upstream has been working to disentangle this, though we still need to work out what to do in Debian. All the same, I needed to fix python-wadllib and multipart seemed like the best fit; I contributed a port upstream and temporarily copied multipart into Debian's python-wadllib source package to allow its tests to pass. I'll come back and fix this properly once we sort out the multipart vs. python-multipart packaging.
  • tzdata moved some timezone definitions to tzdata-legacy, which has broken a number of packages. I added tzdata-legacy build-dependencies to alembic and python-icalendar to deal with this in those packages, though there are still some other instances of this left.
  • I tracked down an nltk regression that caused build failures in many other packages.
  • I fixed Rust crate versioning issues in pydantic-core, python-bcrypt, and python-maturin (mostly fixed by Peter Michael Green and Jelmer Vernooĳ, but it needed a little extra work).
  • I fixed other build failures in entrypoints, mayavi2, python-pyvmomi (mostly fixed by Alexandre Detiste, but it needed a little extra work), and python-testing.postgresql (ditto).
  • I fixed python3-simpletal to tolerate future versions of dh-python that will drop their dependency on python3-setuptools.
  • I fixed broken symlinks in python-treq.
  • I removed (build-)depends on python3-pkg-resources from alembic, autopep8, buildbot, celery, flufl.enum, flufl.lock, python-public, python-wadllib (contributed upstream), pyvisa, routes, vulture, and zodbpickle (contributed upstream).
  • I upgraded astroid, asyncpg (fixing a Python 3.13 failure and a build failure), buildbot (noticing an upstream test bug in the process), dnsdiag, frozenlist, netmiko (fixing a Python 3.13 failure), psycopg3, pydantic-settings, pylint, python-asyncssh, python-bleach, python-btrees, python-cytoolz, python-django-pgtrigger, python-django-test-migrations, python-gssapi, python-icalendar, python-json-log-formatter, python-pgbouncer, python-pkginfo, python-plumbum, python-stdlib-list, python-tokenize-rt, python-treq (fixing a Python 3.13 failure), python-typeguard, python-webargs (fixing a build failure), pyupgrade, pyvisa, pyvisa-py (fixing a Python 3.13 failure), toolz, twisted, vulture, waitress (fixing CVE-2024-49768 and CVE-2024-49769), wtf-peewee, wtforms, zodbpickle, zope.exceptions, zope.interface, zope.proxy, zope.security, and zope.testrunner to new upstream versions.
  • I tried to fix a regression in python-scruffy, but I need testing feedback.
  • I requested removal of python-testing.mysqld.

14 October 2024

Scarlett Gately Moore: Kubuntu 24.10 Released, KDE Snaps at 24.08.2, and I lived to tell you about it!

Happy 28th Birthday KDE!
Sorry my blog updates have been MIA. Let me tell you a story...

As some of you know, 3 months ago I was in a no-fault car accident. Thankfully, the only injury was that I ended up with a broken arm. The ER sends me home in a sling and tells me it was a clean break and it will mend itself in no time. After a week of excruciating pain I went to my follow-up doctor appointment, and with my x-rays in hand, the doc tells me it was far from a clean break and needs surgery. So after a week of my shattered bone scraping my nerves and causing pain I have never felt before, I finally go in for surgery! They put in a metal plate with screws to hold the bone in place so it can properly heal. The nerve pain was gone, so I thought I was on the mend. Some time goes by and the swelling still has not subsided. The doctors are not as concerned about this as I am, so I carry on until it becomes really inflamed and develops fever blisters. After no success in reaching the doctor's office, my husband borrows the neighbor's car and rushes me to the ER. Good thing too, I had an infection. So after a 5 day stay in the hospital, they sent us home loaded with antibiotics and trained my husband in wound packing. We did everything right, kept the place immaculate, followed orders with the wound care, took my antibiotics, yet when they ran out there was still no sign of relief or healing. Went to the doctors and they gave me another month's supply of antibiotics. Two days after my final dose my arm becomes inflamed again, with extra spectacular levels of pain to go with it. I call the doctor's office. They said to come in on my appointment day (4 days away). I asked, "You aren't concerned with this inflammation?", to which they replied, "No." Ok, maybe I am overreacting and it's all in my head, I can power through 4 more days. The following morning my husband observed fever blisters and the wound site was clearly not right, so once again off we go to the ER. Well, thankfully we did. I was in sepsis and could have died.

After deliberating with the doctor on the course of action for treatment, the doctor accepted our plea to remove the plate, rather than tighten screws and have me drive 100 miles to the hospital every day for IV antibiotics (Umm, I don't have a car!?). So after another 4 day stay I am released into the world, alive and well. I am happy to report that the swelling is almost gone, the pain is minimal, and I am finally healing nicely. I am still in a sling and I have to be super careful, as my arm is not fully knitted. So with that I am bummed to say, no traveling for me, no Ubuntu Summit.

I still need help with that car. If it weren't for our neighbor, this story would have ended much differently. https://gofund.me/00942f47

Despite my tragic few months for my right arm, my left arm has been quite busy. Thankfully I am a lefty! On to my work progress report.

Kubuntu:
With Plasma 6! A big thank you to the Debian KDE/QT team and Rik Mills, could not have done it without you!
KDE Snaps: All release service snaps are done, save a few problematic ones that are still WIP. I have released 24.08.2, which you can find here: https://snapcraft.io/publisher/kde
I completed the qt6 and KDE frameworks 6 content packs for core24.
Snapcraft: I have a PR in for kde-neon-6 extension core24 support.
That's all for now. Thanks for stopping by!

10 October 2024

Freexian Collaborators: Debian Contributions: Packaging Pydantic v2, Reworking of glib2.0 for cross bootstrap, Python archive rebuilds and more! (by Anupa Ann Joseph)

Debian Contributions: 2024-09
Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Pydantic v2, by Colin Watson
Pydantic is a useful library for validating data in Python using type hints: Freexian uses it in a number of projects, including Debusine. Its Debian packaging had been stalled at 1.10.17 in testing for some time, partly due to needing to make sure everything else could cope with the breaking changes introduced in 2.x, but mostly due to needing to sort out packaging of its new Rust dependencies. Several other people (notably Alexandre Detiste, Andreas Tille, Drew Parsons, and Timo Röhling) had made some good progress on this, but nobody had quite got it over the line and it seemed a bit stuck. Colin upgraded a few Rust libraries to new upstream versions, packaged rust-jiter, and chased various failures in other packages. This eventually allowed getting current versions of both pydantic-core and pydantic into testing. It should now be much easier for us to stay up to date routinely.

Reworking of glib2.0 for cross bootstrap, by Helmut Grohne
Simon McVittie (not affiliated with Freexian) earlier restructured the libglib2.0-dev package such that it would absorb more functionality and in particular provide tools for working with .gir files. Those tools effectively need to be run for their host architecture (in practice this means running under qemu-user), which is at odds with the requirements of architecture cross bootstrap. The qemu requirement was expressed in package dependencies and also made people unhappy who were attempting to use libglib2.0-dev for i386 on amd64 without resorting to qemu. The use of qemu in architecture bootstrap is particularly problematic as it tends to not be ready at the time bootstrapping is needed.
As a result, Simon proposed and implemented the introduction of a libgio-2.0-dev package providing a subset of libglib2.0-dev that does not require qemu. Packages should continue to use libglib2.0-dev in their Build-Depends unless involved in architecture bootstrap. Helmut reviewed and tested the implementation and integrated the necessary changes into rebootstrap. He also prepared a patch for libverto to use the new package and proposed adding forward compatibility to glib2.0.
Helmut continued working on adding cross-exe-wrapper to architecture-properties and implemented autopkgtests, which Simon later improved. The cross-exe-wrapper package now provides a generic mechanism to run a program on a different architecture, using qemu only when needed. For instance, a dependency on cross-exe-wrapper:i386 provides an i686-linux-gnu-cross-exe-wrapper program that can be used to wrap an ELF executable for the i386 architecture. When installed on amd64 or i386 it will skip installing or running qemu, but for other architectures qemu will be used automatically. This facility can be used to support cross building with targeted use of qemu in cases where running host code is unavoidable, as is the case for GObject introspection.
This concludes the joint work with Simon and Niels Thykier on glib2.0 and architecture-properties, resolving known architecture bootstrap regressions arising from the glib2.0 refactoring earlier this year.

Analyzing binary package metadata, by Helmut Grohne
As Guillem Jover (not affiliated with Freexian) continues to work on adding metadata tracking to dpkg, the question arises how this affects existing packages. The dedup.debian.net infrastructure provides an easy playground to answer such questions, so Helmut gathered file metadata from all binary packages in unstable and performed an explorative analysis that produced a number of results. Guillem also performed a cursory analysis and reported other problem categories, such as mismatching directory permissions for directories installed by multiple packages, and thus gained a better understanding of what consistency checks dpkg can enforce.

Python archive rebuilds, by Stefano Rivera
Last month Stefano started to write some tooling to do large-scale rebuilds in debusine, starting with finding packages that had already started to fail to build from source (FTBFS) due to the removal of setup.py test. This month, Stefano did some more rebuilds, starting with experimental versions of dh-python.
During the Python 3.12 transition, we had added a dependency on python3-setuptools to dh-python, to ease the transition. Python 3.12 removed distutils from the stdlib, but many packages were expecting it to still be available. Setuptools contains a version of distutils, and dh-python was a convenient place to depend on setuptools for most package builds. This dependency was never meant to be permanent. A rebuild without it resulted in mass-filing about 340 bugs (and around 80 more by mistake).
A new feature in Python 3.12 was to have unittest's test runner exit with a non-zero return code if no tests were run. We added this feature to be able to detect tests that, by mistake, are not being discovered. We are ignoring this failure, as we wouldn't want to suddenly cause hundreds of packages to fail to build if they have no tests. Stefano did a rebuild to see how many packages were affected, and found that around 1000 were. The Debian Python community has not come to a conclusion on how to move forward with this.
As soon as Python 3.13 release candidate 2 was available, Stefano did a rebuild of the Python packages in the archive against it. This was a more complex rebuild than the others, as it had to be done in stages. Many packages need other Python packages at build time, typically to run tests. So transitions like this involve some manual bootstrapping, followed by several rounds of builds. Not all packages could be tested, as not all their dependencies support 3.13 yet. The result was around 100 bugs in packages that need work to support Python 3.13. Many other packages will need additional work to properly support Python 3.13, but being able to build (and run tests) is an important first step.
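The unittest behaviour mentioned above is easy to observe in isolation. The sketch below is an illustration only: the helper name is hypothetical and it has nothing to do with how dh-python or debusine actually run tests. It runs test discovery in an empty temporary directory and reports the exit status. To my understanding the status is 5 on Python 3.12 and later and 0 on earlier versions, which is why the example prints the value rather than hard-coding it.

```python
import subprocess
import sys
import tempfile


def unittest_exit_code_for_empty_dir() -> int:
    """Run `python -m unittest discover` where there are no tests at all."""
    with tempfile.TemporaryDirectory() as empty_dir:
        proc = subprocess.run(
            [sys.executable, "-m", "unittest", "discover", "-s", empty_dir],
            capture_output=True,  # keep the "Ran 0 tests" noise out of our output
            text=True,
        )
    return proc.returncode


if __name__ == "__main__":
    code = unittest_exit_code_for_empty_dir()
    version = ".".join(map(str, sys.version_info[:2]))
    print(f"Python {version}: unittest exited with status {code}")
```

A build tool that wants to tolerate packages without tests has to treat this particular status differently from a genuine test failure.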

Miscellaneous contributions
  • Carles prepared the update of python-pyaarlo package to a new upstream release.
  • Carles worked on updating python-ring-doorbell to a new upstream release. This is unfinished, pending the packaging of a new dependency, python3-firebase-messaging (RFP #1082958), and its dependency python3-http-ece (RFP #1083020).
  • Carles improved po-debconf-manager. The main new feature is that it can open Salsa merge requests. The aim is to have it functional end to end in time for a lightning talk at MiniDebConf Toulouse (November) and to get feedback from the wider public on this proof of concept.
  • Carles helped one translator to use po-debconf-manager (added compatibility for bullseye, fixed other issues) and reviewed 17 package templates.
  • Colin upgraded the OpenSSH packaging to 9.9p1.
  • Colin upgraded the various YubiHSM packages to new upstream versions, enabled more tests, fixed yubihsm-shell build failures on some 32-bit architectures, made yubihsm-shell build reproducibly, and fixed yubihsm-connector to apply udev rules to existing devices when the package is installed. As usual, bookworm-backports is up to date with all these changes.
  • Colin fixed quite a bit of fallout from setuptools 72.0.0 removing setup.py test, backported a large upstream patch set to make buildbot work with SQLAlchemy 2.0, and upgraded 25 other Python packages to new upstream versions.
  • Enrico worked with Jakob Haufe to get him up to speed for managing sso.debian.org.
  • Raphaël removed spam entries in the list of teams on tracker.debian.org (see #1080446), and he applied a few external contributions, fixing a rendering issue and replacing the DDPO link with a more useful alternative. He also gave feedback on a couple of merge requests that required more work. As part of the analysis of the underlying problem, he suggested to the ftpmasters (via #1083068) to auto-reject packages having the too-many-contacts lintian error, and he raised the severity of #1076048 to serious to actually have that 4-year-old bug fixed.
  • Raphaël uploaded zim and hamster-time-tracker to fix issues with Python 3.12 getting rid of setuptools. He also uploaded a new gnome-shell-extension-hamster to cope with the upcoming transition to GNOME 47.
  • Helmut sent seven patches and sponsored one upload for cross build failures.
  • Helmut uploaded a Nagios/Icinga plugin check-smart-attributes for monitoring the health of physical disks.
  • Helmut collaborated on sbuild, reviewing and improving an MR for refactoring the unshare backend.
  • Helmut sent a patch fixing coinstallability of gcc-defaults.
  • Helmut continued to monitor the evolution of the /usr-move. With more and more key packages such as libvirt or fuse3 fixed, we're moving into the boring long tail of the transition.
  • Helmut proposed updating the meson buildsystem in debhelper to use env2mfile.
  • Helmut continued to update patches maintained in rebootstrap. Due to the work on glib2.0 above, rebootstrap moves a lot further, but still fails for any architecture.
  • Santiago reviewed some merge requests in Salsa CI, such as !478, proposed by Otto to extend the information about how to use additional runners in the pipeline, and !518, proposed by Ahmed to add support for Ubuntu images, which will help to test how some Debian packages, including the complex MariaDB, are built on Ubuntu. Santiago also prepared !545, which will make the reprotest job more consistent with the result seen on reproducible-builds.
  • Santiago worked on different tasks related to DebConf 25. In particular, he drafted the fundraising brochure (which is almost ready).
  • Thorsten Alteholz uploaded package libcupsfilter to fix the autopkgtest and a dependency problem of this package. After package splix was abandoned by upstream and OpenPrinting.org adopted its maintenance, Thorsten uploaded their first release.
  • Anupa published posts on the Debian Administrators group in LinkedIn and moderated the group, one of the tasks of the Debian Publicity Team.
  • Anupa helped organize DebUtsav 2024. It had over 100 attendees, with hands-on sessions on making initial contributions to the Linux kernel, Debian packaging, submitting documentation to the Debian wiki and assisting with Debian installations.

7 October 2024

Reproducible Builds: Reproducible Builds in September 2024

Welcome to the September 2024 report from the Reproducible Builds project! Our reports attempt to outline what we've been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. New binsider tool to analyse ELF binaries
  2. Unreproducibility of GHC Haskell compiler 95% fixed
  3. Mailing list summary
  4. Towards a 100% bit-for-bit reproducible OS
  5. Two new reproducibility-related academic papers
  6. Distribution work
  7. diffoscope
  8. Other software development
  9. Android toolchain core count issue reported
  10. New Gradle plugin for reproducibility
  11. Website updates
  12. Upstream patches
  13. Reproducibility testing framework

New binsider tool to analyse ELF binaries
Reproducible Builds developer Orhun Parmaksız has announced a fantastic new tool to analyse the contents of ELF binaries. According to the project's README page:
Binsider can perform static and dynamic analysis, inspect strings, examine linked libraries, and perform hexdumps, all within a user-friendly terminal user interface!
More information about Binsider's features and how it works can be found within Binsider's documentation pages.

Unreproducibility of GHC Haskell compiler 95% fixed
A seven-year-old bug about the nondeterminism of object code generated by the Glasgow Haskell Compiler (GHC) received a recent update, consisting of Rodrigo Mesquita noting that the issue is:
95% fixed by [merge request] !12680 when -fobject-determinism is enabled.
The linked merge request has since been merged, and Rodrigo goes on to say that:
After that patch is merged, there are some rarer bugs in both interface file determinism (eg. #25170) and in object determinism (eg. #25269) that need to be taken care of, but the great majority of the work needed to get there should have been merged already. When merged, I think we should close this one in favour of the more specific determinism issues like the two linked above.

Mailing list summary
On our mailing list this month:
  • Fay Stegerman let everyone know that she started a thread on the Fediverse about the problems caused by unreproducible zlib/deflate compression in .zip and .apk files and later followed up with the results of her subsequent investigation.
  • Long-time developer kpcyrd wrote that there has been a recent public discussion on the Arch Linux GitLab [instance] about the challenges and possible opportunities for making the Linux kernel package reproducible, all relating to the CONFIG_MODULE_SIG flag.
  • Bernhard M. Wiedemann followed up on an in-person conversation at our recent Hamburg 2024 summit about the potential presence of Reproducible Builds in recognised standards.
  • Fay Stegerman also wrote about her worry about the possible repercussions for RB tooling of Debian migrating from zlib to zlib-ng, as reproducibility requires identical compressed data streams.
  • Martin Monperrus wrote to the list announcing the latest release of maven-lockfile, which is designed to aid building Maven projects with integrity.
  • Lastly, Bernhard M. Wiedemann wrote about the potential role of reproducible builds in combatting silent data corruption, as detailed in a recent Tweet and scholarly paper on faulty CPU cores.
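The zlib/deflate concern raised in the threads above comes down to one property: the same input can be encoded as many different valid compressed streams. A minimal sketch using Python's standard zlib module follows. It varies only the compression level, which is an assumption chosen for illustration; a different library implementation such as zlib-ng may produce different bytes even at the same level, which this sketch does not demonstrate.

```python
import zlib

# Illustrative payload; any reasonably sized input behaves the same way.
data = b"reproducible builds " * 1000

# Two valid encodings of the same data, produced with different settings.
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# Both streams decompress to exactly the original bytes...
same_payload = zlib.decompress(fast) == zlib.decompress(best) == data
# ...but the compressed bytes themselves are not identical.
same_stream = fast == best

print(same_payload, same_stream)
```

This is why archives such as .zip and .apk files can hold identical contents and still fail a bit-for-bit comparison: the check is made on the compressed stream, not on the decompressed data.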

Towards a 100% bit-for-bit reproducible OS
Bernhard M. Wiedemann began writing about his journey towards a 100% bit-for-bit reproducible operating system on the openSUSE wiki:
This is a report of Part 1 of my journey: building 100% bit-reproducible packages for every package that makes up [openSUSE's] minimalVM image. This target was chosen as the smallest useful result/artifact. The larger package-sets get, the more disk-space and build-power is required to build/verify all of them.
This work was sponsored by NLnet's NGI Zero fund.

Distribution work
In Debian this month, 14 reviews of Debian packages were added, 12 were updated and 20 were removed, all adding to our knowledge about identified issues. A number of issue types were updated as well. In addition, Holger opened 4 bugs against the debrebuild component of the devscripts suite of tools. In particular:
  • #1081047: Fails to download .dsc file.
  • #1081048: Does not work with a proxy.
  • #1081050: Fails to create a debrebuild.tar.
  • #1081839: Fails with E: mmdebstrap failed to run error.
Last month, an issue was filed to update the Salsa CI pipeline (used by 1,000s of Debian packages) to no longer test for reproducibility with reprotest's build_path variation. Holger Levsen provided a rationale for this change in the issue, which has already been made to the tests being performed by tests.reproducible-builds.org. This month, this issue was closed by Santiago R. R., nicely explaining that build path variation is no longer the default, and, if desired, how developers may enable it again. In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.

diffoscope
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading version 278 to Debian:
  • New features:
    • Add a helpful contextual message to the output if comparing Debian .orig tarballs within .dsc files without the ability to fuzzy-match away the leading directory.
  • Bug fixes:
    • Drop removal of calculated os.path.basename from GNU readelf output.
    • Correctly invert "X% similar" value and do not emit "100% similar".
  • Misc:
    • Temporarily remove procyon-decompiler from Build-Depends as it was removed from testing (via #1057532). (#1082636)
    • Update copyright years.
For trydiffoscope, the command-line client for the web-based version of diffoscope, Chris Lamb also:
  • Added an explicit python3-setuptools dependency. (#1080825)
  • Bumped the Standards-Version to 4.7.0.

Other software development
disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into system calls to reliably flush out reproducibility issues. This month, version 0.5.11-4 was uploaded to Debian unstable by Holger Levsen making the following changes:
  • Replace build-dependency on the obsolete pkg-config package with one on pkgconf, following a Lintian check.
  • Bump Standards-Version field to 4.7.0, with no related changes needed.

In addition, reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, version 0.7.28 was uploaded to Debian unstable by Holger Levsen including a change by Jelle van der Waa to move away from the pipes Python module to shlex, as the former will be removed in Python version 3.13.
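For readers facing the same migration away from pipes, the shell-quoting helper most code relied on is available as shlex.quote. The following is a minimal sketch of that usage, not reprotest's actual code; the command and path are made up for illustration.

```python
import shlex

# Hypothetical argument list; the last element contains spaces and would
# be split into several words by a shell if left unquoted.
args = ["ls", "-l", "path with spaces/pkg"]

# shlex.quote() escapes each argument so a POSIX shell reads it back as
# a single word. Arguments made only of safe characters are unchanged.
command = " ".join(shlex.quote(arg) for arg in args)
print(command)

# shlex.split() reverses the operation and recovers the original list.
assert shlex.split(command) == args
```

On Python 3.8 and later, shlex.join(args) builds the same string in a single call.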

Android toolchain core count issue reported
Fay Stegerman reported an issue with the Android toolchain where a part of the build system generates a different classes.dex file (and thus a different .apk) depending on the number of cores available during the build, thereby breaking Reproducible Builds:
We've rebuilt [tag v3.6.1] multiple times (each time in a fresh container): with 2, 4, 6, 8, and 16 cores available, respectively:
  • With 2 and 4 cores we always get an unsigned APK with SHA-256 14763d682c9286ef…
  • With 6, 8, and 16 cores we get an unsigned APK with SHA-256 35324ba4c492760… instead.
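A comparison like the one quoted above amounts to hashing each build artifact and grouping the builds by digest. The sketch below shows one way to do that with Python's standard hashlib. The function names sha256_of and group_by_digest are hypothetical, as are any file names passed in; this is not the tooling used in the report.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(paths):
    """Map each distinct digest to the names of the files that share it.

    A reproducible build yields exactly one group; more than one group
    means some builds differ from the others.
    """
    groups = {}
    for path in paths:
        groups.setdefault(sha256_of(path), []).append(Path(path).name)
    return groups
```

Given one artifact per core count, more than one key in the returned dictionary indicates a split like the one reported here.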

New Gradle plugin for reproducibility
A new plugin for the Gradle build tool for Java has been released. This easily-enabled plugin results in:
reproducibility settings [being] applied to some of Gradle's built-in tasks that should really be the default. Compatible with Java 8 and Gradle 8.3 or later.

Website updates
There were a rather substantial number of improvements made to our website this month, including:

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In September, a number of changes were made by Holger Levsen, including:
  • Debian-related changes:
    • Upgrade the osuosl4 node to Debian trixie in anticipation of running debrebuild and rebuilderd there.
    • Temporarily mark the osuosl4 node as offline due to ongoing xfs_repair filesystem maintenance.
    • Do not warn about (very old) broken nodes.
    • Add the riscv64 architecture to the multiarch version skew tests for Debian trixie and sid.
    • Mark the virt32b and virt64b nodes as down.
  • Misc changes:
    • Add support for powercycling OpenStack instances.
    • Update fail2ban to ban hosts for 4 weeks in total and take care to never ban our own Jenkins instance.
In addition, Vagrant Cascadian recorded a disk failure for the virt32b and virt64b nodes, performed some maintenance of the cbxi4a node and marked most armhf architecture systems as being back online.

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

1 October 2024

Guido Günther: Free Software Activities September 2024

Another short status update of what happened on my side last month. Besides the usual amount of housekeeping, last month was a lot about getting old issues resolved by finishing some stale merge requests and work-in-progress MRs. I also pushed out the Phosh 0.42.0 Release.
phosh, phoc, phosh-mobile-settings, libphosh-rs, phosh-osk-stub, phosh-wallpapers, meta-phosh, Debian, ModemManager, Calls, bluez, gnome-text-editor, feedbackd, Chatty, libcall-ui, glib, wlr-protocols, git-buildpackage, iio-sensor-proxy, Fotema
Help Development
If you want to support my work see donations. This includes a list of hardware we want to improve support for. Thanks a lot to all current and past donors.

29 September 2024

Reproducible Builds: Supporter spotlight: Kees Cook on Linux kernel security

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do. This is the eighth installment in a series featuring the projects, companies and individuals who support the Reproducible Builds project. We started this series by featuring the Civil Infrastructure Platform project, and followed this up with a post about the Ford Foundation as well as recent ones about ARDC, the Google Open Source Security Team (GOSST), Bootstrappable Builds, the F-Droid project, David A. Wheeler and Simon Butler. Today, however, we will be talking with Kees Cook, founder of the Kernel Self-Protection Project.

Vagrant Cascadian: Could you tell me a bit about yourself? What sort of things do you work on?
Kees Cook: I'm a Free Software junkie living in Portland, Oregon, USA. I have been focusing on the upstream Linux kernel's protection of itself. There is a lot of support that the kernel provides userspace to defend itself, but when I first started focusing on this there was not as much attention given to the kernel protecting itself. As userspace got more hardened the kernel itself became a bigger target. Almost 9 years ago I formally announced the Kernel Self-Protection Project because the work necessary was way more than my time and expertise could do alone. So I just try to get people to help as much as possible; people who understand the ARM architecture, people who understand the memory management subsystem to help, people who understand how to make the kernel less buggy.
Vagrant: Could you describe the path that led you to working on this sort of thing?
Kees: I have always been interested in security through the aspect of exploitable flaws. I always thought it was like a magic trick to make a computer do something that it was very much not designed to do and seeing how easy it is to subvert bugs. I wanted to improve that fragility. In 2006, I started working at Canonical on Ubuntu and was mainly focusing on bringing Debian and Ubuntu up to what was the state of the art for Fedora and Gentoo's security hardening efforts. Both had really pioneered a lot of userspace hardening with compiler flags and ELF stuff and many other things for hardened binaries. On the whole, Debian had not really paid attention to it. Debian's packaging building process at the time was sort of a chaotic free-for-all as there wasn't a centralized build methodology for defining things. Luckily that did slowly change over the years. In Ubuntu we had the opportunity to apply top down build rules for hardening all the packages. In 2011 Chrome OS was following along and took advantage of a bunch of the security hardening work as they were based on ebuild out of Gentoo and when they looked for someone to help out they reached out to me. We recognized the Linux kernel was pretty much the weakest link in the Chrome OS security posture and I joined them to help solve that. Their userspace was pretty well handled but the kernel had a lot of weaknesses, so focusing on hardening was the next place to go. When I compared notes with other users of the Linux kernel within Google there were a number of common concerns and desires. Chrome OS already had an upstream first requirement, so I tried to consolidate the concerns and solve them upstream.
It was challenging to land anything in other kernel team repos at Google, as they (correctly) wanted to minimize their delta from upstream, so I needed to work on any major improvements entirely in upstream and had a lot of support from Google to do that. As such, my focus shifted further from working directly on Chrome OS into being entirely upstream and being more of a consultant to internal teams, helping with integration or sometimes backporting. Since the volume of needed work was so gigantic I needed to find ways to inspire other developers (both inside and outside of Google) to help. Once I had a budget I tried to get folks paid (or hired) to work on these areas when it wasn't already their job.
Vagrant: So my understanding of some of your recent work is basically defining undefined behavior in the language or compiler?
Kees: I've found the term "undefined behavior" to have a really strict meaning within the compiler community, so I have tried to redefine my goal as eliminating "unexpected behavior" or "ambiguous language constructs". At the end of the day ambiguity leads to bugs, and bugs lead to exploitable security flaws. I've been taking a four-pronged approach: supporting the work people are doing to get rid of ambiguity, identifying new areas where ambiguity needs to be removed, actually removing that ambiguity from the C language, and then dealing with any needed refactoring in the Linux kernel source to adapt to the new constraints. None of this is particularly novel; people have recognized how dangerous some of these language constructs are for decades and decades but I think it is a combination of hard problems and a lot of refactoring that nobody has the interest/resources to do. So, we have been incrementally going after the lowest hanging fruit. One clear example in recent years was the elimination of C's implicit fall-through in switch statements. The language would just fall through between adjacent cases if a break (or other code flow directive) wasn't present. But this is ambiguous: is the code meant to fall-through, or did the author just forget a break statement? By defining the [[fallthrough]] statement, and requiring its use in Linux, all switch statements now have explicit code flow, and the entire class of bugs disappeared. During our refactoring we actually found that 1 in 10 added [[fallthrough]] statements were actually missing break statements. This was an extraordinarily common bug! So getting rid of that ambiguity is where we have been.
Another area I've been spending a bit of time on lately is looking at how defensive security work has challenges associated with metrics. How do you measure your defensive security impact?
You can't say "because we installed locks on the doors, 20% fewer break-ins have happened." Much of our signal is always secondary or retrospective, which is frustrating: "This class of flaw was used X much over the last decade or so, and if we have eliminated that class of flaw and will never see it again, what is the impact?" Is the impact infinity? Attackers will just move to the next easiest thing. But it means that exploitation gets incrementally more difficult. As attack surfaces are reduced, the expense of exploitation goes up.
Vagrant: So it is hard to identify how effective this is. How bad would it be if people just gave up?
Kees: I think it would be pretty bad, because as we have seen, using secondary factors, the work we have done in the industry at large, not just the Linux kernel, has had an impact. What we, Microsoft, Apple, and everyone else are doing for their respective software ecosystems has shown that the price of functional exploits in the black market has gone up. Especially for really egregious stuff like a zero-click remote code execution. If those were cheap then obviously we are not doing something right, and it becomes clear that it's trivial for anyone to attack the infrastructure that our lives depend on. But thankfully we have seen over the last two decades that prices for exploits keep going up and up into millions of dollars. I think it is important to keep working on that because, as a central piece of modern computer infrastructure, the Linux kernel has a giant target painted on it. If we give up, we have to accept that our computers are not doing what they were designed to do, which I can't accept. The safety of my grandparents shouldn't be any different from the safety of journalists, and political activists, and anyone else who might be the target of attacks. We need to be able to trust our devices, otherwise why use them at all?
Vagrant: What has been your biggest success in recent years?
Kees: I think with all these things I am not the only actor. Almost everything that we have been successful at has been because of a lot of people's work, and one of the big ones that has been coordinated across the ecosystem and across compilers was initializing stack variables to 0 by default. This feature was added in Clang, GCC, and MSVC across the board even though there were a lot of fears about forking the C language. The worry was that developers would come to depend on zero-initialized stack variables, but this hasn't been the case because we still warn about uninitialized variables when the compiler can figure that out. So you still get the warnings at compile time but now you can count on the contents of your stack at run-time and we drop an entire class of uninitialized variable flaws. While the exploitation of this class has mostly been around memory content exposure, it has also been used for control flow attacks. So that was politically and technically a large challenge: convincing people it was necessary, showing its utility, and implementing it in a way that everyone would be happy with, resulting in the elimination of a large and persistent class of flaws in C.
Vagrant: In a world where things are generally Reproducible, do you see ways in which that might affect your work?
Kees: One of the questions I frequently get is, "What version of the Linux kernel has feature $foo?" If I know how things are built, I can answer with just a version number. In a Reproducible Builds scenario I can count on the compiler version, compiler flags, kernel configuration, etc.; all those things are known, so I can actually answer definitively that a certain feature exists. So that is an area where Reproducible Builds affects me most directly. Indirectly, just being able to trust that the binaries you are running are going to behave the same for the same build environment is critical for sane testing.
Vagrant: Have you used diffoscope?
Kees: I have! One subset of tree-wide refactoring that we do when getting rid of ambiguous language usage in the kernel is when we have to make source level changes to satisfy some new compiler requirement but where the binary output is not expected to change at all. It is mostly about getting the compiler to understand what is happening, what is intended in the cases where the old ambiguity does actually match the new unambiguous description of what is intended. The binary shouldn't change. We have used diffoscope to compare the before and after binaries to confirm that "yep, there is no change in binary".
Vagrant: You cannot just use checksums for that?
Kees: For the most part, we need to only compare the text segments. We try to hold as much stable as we can, following the Reproducible Builds documentation for the kernel, but there are macros in the kernel that are sensitive to source line numbers and as a result those will change the layout of the data segment (and sometimes the text segment too). With diffoscope there's flexibility where I can exclude or include different comparisons. Sometimes I just go look at what diffoscope is doing and do that manually, because I can tweak that a little harder, but diffoscope is definitely the default. Diffoscope is awesome!
Vagrant: Where has reproducible builds affected you?
Kees: One of the notable wins of reproducible builds lately was dealing with the fallout of the XZ backdoor and just being able to ask the question "is my build environment running the expected code?" and to be able to compare the output generated from one install that never had a vulnerable XZ and one that did have a vulnerable XZ and compare the results of what you get. That was important for kernel builds because the XZ threat actor was working to expand their influence and capabilities to include Linux kernel builds, but they didn't finish their work before they were noticed. I think what happened with Debian proving the build infrastructure was not affected is an important example of how people would have needed to verify the kernel builds too.
Vagrant: What do you want to see for the near or distant future in security work?
Kees: For reproducible builds in the kernel, in the work that has been going on in the ClangBuiltLinux project, one of the driving forces of code and usability quality has been the continuous integration work. As soon as something breaks, on the kernel side, the Clang side, or something in between the two, we get a fast signal and can chase it and fix the bugs quickly. I would like to see someone with funding to maintain a reproducible kernel build CI. There have been places where there are certain architecture configurations or certain build configurations where we lose reproducibility and right now we have sort of a standard open source development feedback loop where those things get fixed but the time in between introduction and fix can be large. Getting a CI for reproducible kernels would give us the opportunity to shorten that time.
Vagrant: Well, thanks for that! Any last closing thoughts?
Kees: I am a big fan of reproducible builds, thank you for all your work. The world is a safer place because of it.
Vagrant: Likewise for your work!


For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

26 September 2024

Melissa Wen: Reflections on 2024 Linux Display Next Hackfest

Hey everyone! The 2024 Linux Display Next hackfest concluded in May, and its outcomes continue to shape the Linux Display stack. Igalia hosted this year's event in A Coruña, Spain, bringing together leading experts in the field. Samuel Iglesias and I organized this year's edition and this blog post summarizes the experience and its fruits. One of the highlights of this year's hackfest was the wide range of backgrounds represented by our 40 participants (both on-site and remotely). Developers and experts from various companies and open-source projects came together to advance the Linux Display ecosystem. You can find the list of participants here. The event covered a broad spectrum of topics affecting the development of Linux projects, user experiences, and the future of display technologies on Linux. From cutting-edge topics to long-term discussions, you can check the event agenda here.

Organization Highlights
The hackfest was marked by in-depth discussions and knowledge sharing among Linux contributors, making everyone inspired, informed, and connected to the community. Building on feedback from the previous year, we refined the unconference format to enhance participant preparation and engagement.
Structured Agenda and Timeboxes: Each session had a defined scope, time limit (1h20 or 2h10), and began with an introductory talk on the topic.
  • Participant-Led Discussions: We pre-selected in-person participants to lead discussions, allowing them to prepare introductions, resources, and scope.
  • Transparent Scheduling: The schedule was shared in advance as GitHub issues, encouraging participants to review and prepare for sessions of interest.
Engaging Sessions: The hackfest featured a variety of topics, including presentations and discussions on how participants were addressing specific subjects within their companies.
  • No Breakout Rooms, No Overlaps: All participants chose to attend all sessions, eliminating the need for separate breakout rooms. We also adapted the schedule at run time to keep everybody involved in the same topics.
  • Real-time Updates: We provided notifications and updates through dedicated emails and the event matrix room.
Strengthening Community Connections: The hackfest offered ample opportunities for networking among attendees.
  • Social Events: Igalia sponsored coffee breaks, lunches, and a dinner at a local restaurant.
  • Museum Visit: Participants enjoyed a sponsored visit to the Museum of Estrela Galicia Beer (MEGA).

Fruitful Discussions and Follow-up
The structured agenda and breaks allowed us to cover multiple topics during the hackfest. These discussions have led to new display feature development and improvements, as evidenced by patches, merge requests, and implementations in project repositories and mailing lists. With the KMS color management API taking shape, we discussed refinements and the best approaches to cover the variety of color pipelines from different hardware vendors. We are also investigating techniques for performant SDR<->HDR content reproduction and for reducing latency and power consumption when using the color blocks of the hardware.

Color Management/HDR
Color Management and HDR continued to be the hottest topic of the hackfest. We had three sessions dedicated to discussing Color and HDR across Linux Display stack layers.

Color/HDR (Kernel-Level)
Harry Wentland (AMD) led this session. Here, kernel developers shared the Color Management pipeline of AMD, Intel and NVidia. We relied on diagrams and explanations from hardware vendors' developers, who discussed differences, constraints and paths to fit them into the KMS generic color management properties, such as advertising modeset needs, IN_FORMAT, segmented LUTs, interpolation types, etc. Developers from Qualcomm and ARM also added information regarding their hardware. Upstream work related to this session:

Color/HDR (Compositor-Level)
Sebastian Wick (RedHat) led this session. It started with Sebastian's presentation covering Wayland color protocols and compositor implementation. It also included an explanation of the APIs provided by Wayland and how they can be used to achieve better color management for applications, and discussions around ICC profiles and color representation metadata. There was also an intensive Q&A about LittleCMS with Marti Maria. Upstream work related to this session:

Color/HDR (Use Cases and Testing) Christopher Cameron (Google) and Melissa Wen (Igalia) led this session. In contrast to the other sessions, here we focused less on implementation and more on brainstorming and reflections on real-world SDR and HDR transformations (use and validation) and gainmaps. Christopher gave a nice presentation explaining HDR gainmap images and how we should think of HDR. This presentation and Q&A were important to put participants on the same page about how to transition between SDR and HDR and somehow emulate HDR. We also discussed the usage of a kernel background color property. Finally, we discussed a bit about Chamelium and the future of VKMS (future work and maintainership).

Power Savings vs Color/Latency Mario Limonciello (AMD) led this session. Mario gave an introductory presentation about AMD ABM (adaptive backlight management), which is similar to Intel DPST. After some discussion, we agreed on exposing a kernel property for power saving policy. This work has already been merged into the kernel, and the userspace support is under development. Upstream work related to this session:

Strategy for video and gaming use-cases Leo Li (AMD) led this session. Miguel Casas (Google) started this session with a presentation of Overlays in Chrome/OS Video, explaining the main goal of power saving by switching off the GPU for accelerated compositing and the challenges of different colorspaces/HDR for video on Linux. Then Leo Li presented different strategies for video and gaming, and we discussed userspace's need for more detailed feedback mechanisms to understand failures when offloading. Also, creating a debugfs interface came up as a tool for debugging and analysis.

Real-time scheduling and async KMS API Xaver Hugl (KDE/BlueSystems) led this session. Compositor developers have exposed some issues with doing real-time scheduling and async page flips. One is that the kernel limits the lifetime of realtime threads, and if a modeset takes too long, the thread will be killed, and thus the compositor as well. Also, simple page flips take longer than expected and drivers should optimize them. Another issue is the lack of feedback to compositors about hardware programming time and commit deadlines (the latest possible time to commit). This is difficult to predict from drivers, since it varies greatly with the type of properties. For example, color management updates take much longer. In this regard, we discussed implementing a hw_done callback to timestamp when the hardware programming of the last atomic commit is complete. We also discussed an API to pre-program the color pipeline in a kind of A/B scheme. It may not be supported by all drivers, but might be useful in different ways.

VRR/Frame Limit, Display Mux, Display Control, and more and beer We also had sessions to discuss a new KMS API to mitigate headaches with VRR and Frame Limit, such as different brightness levels at different refresh rates, abrupt changes of refresh rates, low frame rate compensation (LFC) and precise timing in VRR, and more. On Display Control, we discussed features missing in the current KMS interface for HDR mode, atomic backlight settings, source-based tone mapping, etc. We also discussed the need for a place where compositor developers can post TODOs to be developed by KMS people. The Content-adaptive Scaling and Sharpening session focused on sharpening and scaling filters. In the Display Mux session, we discussed proposals to expose the capability of dynamic mux switching of the display signal between discrete and integrated GPUs. In the last session of the 2024 Display Next Hackfest, participants representing different compositors summarized current and future work and built a Linux Display "wish list", which includes: improvements to VTTY and HDR switching, a better dmabuf API for multi-GPU support, definition of tone mapping, blending and scaling semantics, and Wayland protocols for advertising to clients which colorspaces are supported. We closed this session with a status update on feature development by compositors, including but not limited to: plane offloading (from libcamera to output) / HDR video offloading (dma-heaps) / plane-based scrolling for web pages, color management / HDR / ICC profiles support, addressing issues such as flickering when color primaries don't match, etc. After three days of intensive discussions, all in-person participants went on a guided tour of the Museum of Estrella Galicia Beer (MEGA), pouring and tasting the most famous local beer.

Feedback and Future Directions Participants provided valuable feedback on the hackfest, including suggestions for future improvements.
  • Schedule and Break-time Setup: Having a pre-defined agenda and schedule provided a better balance between long discussions and mental refreshments, preventing the fatigue caused by endless discussions.
  • Action Points: Some participants recommended explicitly asking for action points at the end of each session and assigning people to follow-up tasks.
  • Remote Participation: Remote attendees appreciated the inclusive setup and opportunities to actively participate in discussions.
  • Technical Challenges: There were bandwidth and video streaming issues during some sessions due to the large number of participants.

Thank you for joining the 2024 Display Next Hackfest We can't help but thank the 40 participants, who engaged in person or virtually in relevant discussions, for a collaborative evolution of the Linux display stack and for building an insightful agenda. A big thank you to the leaders and presenters of the nine sessions: Christopher Cameron (Google), Harry Wentland (AMD), Leo Li (AMD), Mario Limonciello (AMD), Sebastian Wick (Red Hat) and Xaver Hugl (KDE/BlueSystems) for the effort in preparing the sessions, explaining the topics and guiding discussions. My acknowledgment to the other in-person participants who made such an effort to travel to A Coruña: Alex Goins (NVIDIA), David Turner (Raspberry Pi), Georges Stavracas (Igalia), Joan Torres (SUSE), Liviu Dudau (Arm), Louis Chauvet (Bootlin), Robert Mader (Collabora), Tian Mengge (GravityXR), Victor Jaquez (Igalia) and Victoria Brekenfeld (System76). It was an awesome opportunity to meet you and chat face-to-face. Finally, thanks to the virtual participants who couldn't make it in person but organized their days to actively participate in each discussion, adding different perspectives and valuable input even remotely: Abhinav Kumar (Qualcomm), Chaitanya Borah (Intel), Christopher Braga (Qualcomm), Dor Askayo (Red Hat), Jiri Koten (Red Hat), Jonas Ådahl (Red Hat), Leandro Ribeiro (Collabora), Marti Maria (Little CMS), Marijn Suijten, Mario Kleiner, Martin Stransky (Red Hat), Michel Dänzer (Red Hat), Miguel Casas-Sanchez (Google), Mitulkumar Golani (Intel), Naveen Kumar (Intel), Niels De Graef (Red Hat), Pekka Paalanen (Collabora), Pichika Uday Kiran (AMD), Shashank Sharma (AMD), Sriharsha PV (AMD), Simon Ser, Uma Shankar (Intel) and Vikas Korjani (AMD). We look forward to another successful Display Next hackfest, continuing to drive innovation and improvement in the Linux display ecosystem!

25 September 2024

Melissa Wen: Reflections on 2024 Linux Display Next Hackfest

Hey everyone! The 2024 Linux Display Next hackfest concluded in May, and its outcomes continue to shape the Linux Display stack. Igalia hosted this year's event in A Coruña, Spain, bringing together leading experts in the field. Samuel Iglesias and I organized this year's edition and this blog post summarizes the experience and its fruits. One of the highlights of this year's hackfest was the wide range of backgrounds represented by our 40 participants (both on-site and remotely). Developers and experts from various companies and open-source projects came together to advance the Linux Display ecosystem. You can find the list of participants here. The event covered a broad spectrum of topics affecting the development of Linux projects, user experiences, and the future of display technologies on Linux. From cutting-edge topics to long-term discussions, you can check the event agenda here.

Organization Highlights The hackfest was marked by in-depth discussions and knowledge sharing among Linux contributors, making everyone inspired, informed, and connected to the community. Building on feedback from the previous year, we refined the unconference format to enhance participant preparation and engagement.
Structured Agenda and Timeboxes: Each session had a defined scope, time limit (1h20 or 2h10), and began with an introductory talk on the topic.
  • Participant-Led Discussions: We pre-selected in-person participants to lead discussions, allowing them to prepare introductions, resources, and scope.
  • Transparent Scheduling: The schedule was shared in advance as GitHub issues, encouraging participants to review and prepare for sessions of interest.
Engaging Sessions: The hackfest featured a variety of topics, including presentations and discussions on how participants were addressing specific subjects within their companies.
  • No Breakout Rooms, No Overlaps: All participants chose to attend all sessions, eliminating the need for separate breakout rooms. We also adapted the schedule at run time to keep everybody involved in the same topics.
  • Real-time Updates: We provided notifications and updates through dedicated emails and the event Matrix room.
Strengthening Community Connections: The hackfest offered ample opportunities for networking among attendees.
  • Social Events: Igalia sponsored coffee breaks, lunches, and a dinner at a local restaurant.
  • Museum Visit: Participants enjoyed a sponsored visit to the Museum of Estrella Galicia Beer (MEGA).

Fruitful Discussions and Follow-up The structured agenda and breaks allowed us to cover multiple topics during the hackfest. These discussions have led to new display feature development and improvements, as evidenced by patches, merge requests, and implementations in project repositories and mailing lists. With the KMS color management API taking shape, we discussed refinements and best approaches to cover the variety of color pipelines from different hardware vendors. We are also investigating techniques for performant SDR<->HDR content reproduction and for reducing latency and power consumption when using the color blocks of the hardware.

Color Management/HDR Color Management and HDR continued to be the hottest topic of the hackfest. We had three sessions dedicated to discussing Color and HDR across the Linux Display stack layers.

Color/HDR (Kernel-Level) Harry Wentland (AMD) led this session. Here, kernel developers shared the Color Management pipelines of AMD, Intel and NVIDIA. We had diagrams and explanations from hardware-vendor developers, who discussed differences, constraints and paths to fit them into the KMS generic color management properties, such as advertising modeset needs, IN_FORMAT, segmented LUTs, interpolation types, etc. Developers from Qualcomm and ARM also added information regarding their hardware. Upstream work related to this session:

Color/HDR (Compositor-Level) Sebastian Wick (Red Hat) led this session. It started with Sebastian's presentation covering Wayland color protocols and compositor implementation. It also included an explanation of the APIs provided by Wayland and how they can be used to achieve better color management for applications, as well as discussions around ICC profiles and color representation metadata. There was also an intensive Q&A about LittleCMS with Marti Maria. Upstream work related to this session:

Color/HDR (Use Cases and Testing) Christopher Cameron (Google) and Melissa Wen (Igalia) led this session. In contrast to the other sessions, here we focused less on implementation and more on brainstorming and reflections on real-world SDR and HDR transformations (use and validation) and gainmaps. Christopher gave a nice presentation explaining HDR gainmap images and how we should think of HDR. This presentation and Q&A were important to put participants on the same page about how to transition between SDR and HDR and somehow emulate HDR. We also discussed the usage of a kernel background color property. Finally, we discussed a bit about Chamelium and the future of VKMS (future work and maintainership).

Power Savings vs Color/Latency Mario Limonciello (AMD) led this session. Mario gave an introductory presentation about AMD ABM (adaptive backlight management), which is similar to Intel DPST. After some discussion, we agreed on exposing a kernel property for power saving policy. This work has already been merged into the kernel, and the userspace support is under development. Upstream work related to this session:

Strategy for video and gaming use-cases Leo Li (AMD) led this session. Miguel Casas (Google) started this session with a presentation of Overlays in Chrome/OS Video, explaining the main goal of power saving by switching off the GPU for accelerated compositing and the challenges of different colorspaces/HDR for video on Linux. Then Leo Li presented different strategies for video and gaming, and we discussed userspace's need for more detailed feedback mechanisms to understand failures when offloading. Also, creating a debugfs interface came up as a tool for debugging and analysis.

Real-time scheduling and async KMS API Xaver Hugl (KDE/BlueSystems) led this session. Compositor developers have exposed some issues with doing real-time scheduling and async page flips. One is that the kernel limits the lifetime of realtime threads, and if a modeset takes too long, the thread will be killed, and thus the compositor as well. Also, simple page flips take longer than expected and drivers should optimize them. Another issue is the lack of feedback to compositors about hardware programming time and commit deadlines (the latest possible time to commit). This is difficult to predict from drivers, since it varies greatly with the type of properties. For example, color management updates take much longer. In this regard, we discussed implementing a hw_done callback to timestamp when the hardware programming of the last atomic commit is complete. We also discussed an API to pre-program the color pipeline in a kind of A/B scheme. It may not be supported by all drivers, but might be useful in different ways.

VRR/Frame Limit, Display Mux, Display Control, and more and beer We also had sessions to discuss a new KMS API to mitigate headaches with VRR and Frame Limit, such as different brightness levels at different refresh rates, abrupt changes of refresh rates, low frame rate compensation (LFC) and precise timing in VRR, and more. On Display Control, we discussed features missing in the current KMS interface for HDR mode, atomic backlight settings, source-based tone mapping, etc. We also discussed the need for a place where compositor developers can post TODOs to be developed by KMS people. The Content-adaptive Scaling and Sharpening session focused on sharpening and scaling filters. In the Display Mux session, we discussed proposals to expose the capability of dynamic mux switching of the display signal between discrete and integrated GPUs. In the last session of the 2024 Display Next Hackfest, participants representing different compositors summarized current and future work and built a Linux Display "wish list", which includes: improvements to VTTY and HDR switching, a better dmabuf API for multi-GPU support, definition of tone mapping, blending and scaling semantics, and Wayland protocols for advertising to clients which colorspaces are supported. We closed this session with a status update on feature development by compositors, including but not limited to: plane offloading (from libcamera to output) / HDR video offloading (dma-heaps) / plane-based scrolling for web pages, color management / HDR / ICC profiles support, addressing issues such as flickering when color primaries don't match, etc. After three days of intensive discussions, all in-person participants went on a guided tour of the Museum of Estrella Galicia Beer (MEGA), pouring and tasting the most famous local beer.

Feedback and Future Directions Participants provided valuable feedback on the hackfest, including suggestions for future improvements.
  • Schedule and Break-time Setup: Having a pre-defined agenda and schedule provided a better balance between long discussions and mental refreshments, preventing the fatigue caused by endless discussions.
  • Action Points: Some participants recommended explicitly asking for action points at the end of each session and assigning people to follow-up tasks.
  • Remote Participation: Remote attendees appreciated the inclusive setup and opportunities to actively participate in discussions.
  • Technical Challenges: There were bandwidth and video streaming issues during some sessions due to the large number of participants.

Thank you for joining the 2024 Display Next Hackfest We can't help but thank the 40 participants, who engaged in person or virtually in relevant discussions, for a collaborative evolution of the Linux display stack and for building an insightful agenda. A big thank you to the leaders and presenters of the nine sessions: Christopher Cameron (Google), Harry Wentland (AMD), Leo Li (AMD), Mario Limonciello (AMD), Sebastian Wick (Red Hat) and Xaver Hugl (KDE/BlueSystems) for the effort in preparing the sessions, explaining the topics and guiding discussions. My acknowledgment to the other in-person participants who made such an effort to travel to A Coruña: Alex Goins (NVIDIA), David Turner (Raspberry Pi), Georges Stavracas (Igalia), Joan Torres (SUSE), Liviu Dudau (Arm), Louis Chauvet (Bootlin), Robert Mader (Collabora), Tian Mengge (GravityXR), Victor Jaquez (Igalia) and Victoria Brekenfeld (System76). It was an awesome opportunity to meet you and chat face-to-face. Finally, thanks to the virtual participants who couldn't make it in person but organized their days to actively participate in each discussion, adding different perspectives and valuable input even remotely: Abhinav Kumar (Qualcomm), Chaitanya Borah (Intel), Christopher Braga (Qualcomm), Dor Askayo, Jiri Koten (Red Hat), Jonas Ådahl (Red Hat), Leandro Ribeiro (Collabora), Marti Maria (Little CMS), Marijn Suijten, Mario Kleiner, Martin Stransky (Red Hat), Michel Dänzer (Red Hat), Miguel Casas-Sanchez (Google), Mitulkumar Golani (Intel), Naveen Kumar (Intel), Niels De Graef (Red Hat), Pekka Paalanen (Collabora), Pichika Uday Kiran (AMD), Shashank Sharma (AMD), Sriharsha PV (AMD), Simon Ser, Uma Shankar (Intel) and Vikas Korjani (AMD). We look forward to another successful Display Next hackfest, continuing to drive innovation and improvement in the Linux display ecosystem!

23 September 2024

Jonathan McDowell: The (lack of a) return-to-office conspiracy

During COVID companies suddenly found themselves able to offer remote working where it hadn't previously been on offer. That's changed over the past 2 or so years, with most places I'm aware of moving back from a fully remote situation to either some sort of hybrid, or even full time office attendance. For example last week Amazon announced a full return to office, having already pulled remote-hired workers in for 3 days a week. I've seen a lot of folk stating they'll never work in an office again, and that RTO is insanity. Despite being lucky enough to work fully remotely (for a role I'd been approached about before, but was never prepared to relocate for), I feel the objections from those who are pro-remote often fail to consider the nuances involved. So let's talk about some of the reasons why companies might want to enforce some sort of RTO.

Real estate value Let's clear this one up first. It's not about real estate value, for most companies. City planners and real estate investors might care, but even if your average company owned their building they'd close it in an instant all other things being equal. An unoccupied building costs a lot less to maintain. And plenty of companies rent and would save money even if there's a substantial exit fee.

Occupancy levels That said, once you have anyone in the building the equation changes. If you're having to provide power, heating, internet, security/front desk staff etc, you want to make sure you're getting your money's worth. There's no point heating a building that can seat 100 for only 10 people present. One option is to downsize the building, but that leads to not being able to assign everyone a desk, for example. No one I know likes hot desking. There are also scheduling problems about ensuring there are enough desks for everyone who might turn up on a certain day, and you've ruled out the option of company/office wide events.

Coexistence builds relationships As a remote worker I wish it wasn't true that most people find it easier to form relationships in person, but it is. Some of this can be worked on with specific teambuilding style events, rather than in office working, but I know plenty of folk who hate those as much as they hate the idea of being in the office. I am lucky in that I work with a bunch of folk who are terminally online, so it's much easier to have those casual conversations even being remote, but I also accept I miss out on some things because I'm just not in the office regularly enough. You might not care about this ("I just need to put my head down and code, not talk to people"), but don't discount it as a valid reason why companies might want their workers to be in the office. This often matters even more for folk at the start of their career, where having a bunch of experienced folk around to help them learn and figure things out ends up working much better in person (my first job offered to let me go mostly remote when I moved to Norwich, but I said no as I knew I wasn't ready for it yet).

Coexistence allows for unexpected interactions People hate the phrase "water cooler chat", and I get that, but it covers the idea of casual conversations that just won't happen the same way when people are remote. I experienced this while running Black Cat; every time Simon and I met up in person we had a bunch of useful conversations even though we were on IRC together normally, and had a VoIP setup that meant we regularly talked too. Equally when I was at Nebulon there were conversations I overheard in the office where I was able to correct a misconception or provide extra context. Some of this can be replicated with the right online chat culture, but I've found many places end up with folk taking conversations to DMs, or they happen in private channels. It happens more naturally in an office environment.

It's easier for bad managers to manage bad performers Again, this falls into the category of things that shouldn't be true, but are. Remote working has increased the ability for people who want to slack off to do so without being easily detected. Ideally what you want is that these folk, if they fail to perform, are then performance managed out of the organisation. That's hard though; there are (rightly) a bunch of rights workers have (I'm writing from a UK perspective) around the procedure that needs to be followed. Managers need organisational support in this to make sure they get it right (and folk are given a chance to improve), which is often lacking.

Summary Look, I get there are strong reasons why offering remote is a great thing from the company perspective, but what I've tried to outline here is that a return-to-office mandate can have some compelling reasons behind it too. Some of those might be things that wouldn't exist in an ideal world, but unfortunately fixing them is a bigger issue than just changing where folk work from. Not acknowledging that just makes any reaction against office work seem ill-informed, to me.

17 September 2024

Jonathan Dowland: ouch, part 3

The debridement operation was a success: nothing bad grew afterwards. I was discharged after a couple of nights with crutches, instructions not to weight-bear, a remarkable, portable negative-pressure "Vac" pump that lived by my side, and some strong painkillers. About two weeks later, I had a skin graft. The surgeon took some skin from my thigh and stitched it over the debridement wound. I was discharged same-day, again with the Vac pump, and again with instructions not to weight-bear, at least for a few days. This time I only kept the Vac pump for a week, and after a dressing change (the first time I saw the graft), I was allowed to walk again. Doing so is strangely awkward, and sometimes a little painful. I have physio exercises to help me regain strength and understanding about what I can do. The donor site remained bandaged for another week before I saw it. I was expecting a stitched cut, but the surgeons have removed the top few layers only, leaving what looks more like a graze or sun-burn. There are four smaller, tentative-looking marks adjacent, suggesting they got it right on the fifth attempt. I'm not sure but I think these will all fade away to near-invisibility with time, and they don't hurt at all. I've now been off work for roughly 12 weeks, but I think I am returning very soon. I am looking forward to returning to some sense of normality. It's been an interesting experience. I thought about writing more about what I've gone through, in particular my experiences in Hospital, dealing with the bureaucracy and things falling "between the gaps". Hanif Kureishi has done a better job than I could. It's clear that the NHS is staffed by incredibly passionate people, but there are a lot of structural problems that interfere with care.

Russ Allbery: Review: The Book That Broke the World

Review: The Book That Broke the World, by Mark Lawrence
Series: Library Trilogy #2
Publisher: Ace
Copyright: 2024
ISBN: 0-593-43796-9
Format: Kindle
Pages: 366
The Book That Broke the World is high fantasy and a direct sequel to The Book That Wouldn't Burn. You should not start here. In a delightful break from normal practice, the author provides a useful summary of the previous volume at the start of this book to jog your memory. At the end of The Book That Wouldn't Burn, the characters were scattered and in various states of corporeality after some major revelations about the nature of the Library and the first appearance of the insectile Skeer. The Book That Broke the World picks up where it left off, and there is a lot more contact with the Skeer, but my guess that they would be the next viewpoint characters does not pan out. Instead, we get a new group and a new protagonist: Celcha, who sees angels who come to visit her brother. I have complaints, but before I launch into those, I should say that I liked this book apart from the totally unnecessary cannibalism. (I'll get to that.) Livira is a bit sidelined, which is regrettable, but Celcha and her brother are interesting new characters, and both Arpix and Clovis, supporting characters in the first book, get some excellent character development. Similar to the first book, this is a puzzle box story full of world-building tidbits with intellectually-satisfying interactions. Lawrence elaborates and complicates his setting in ways that don't contradict earlier parts of the story but create more room and depth for the characters to be creative. I came away still invested in this world and eager to find out how Lawrence pulls the world-building and narrative threads together. The biggest drawback of this book is that it's not new. My thought after finishing the first book of the series was that if Lawrence had enough world-building ideas to fill three books to that same level of density, this had the potential of being one of my favorite fantasy series of all time. By the end of the second book, I concluded that this is not the case.
Instead of showing us new twists and complications the way the first book did throughout, The Book That Broke the World mostly covers the same thematic ground from some new angles. It felt like Lawrence was worried the reader of the first book may not have understood the theme or the world-building, so he spent most of the second book nailing down anything that moved. I found that frustrating. One of the best parts of The Book That Wouldn't Burn was that Lawrence trusted the reader to keep up, which for me hit the glorious but rare sweet spot of pacing where I was figuring out the world at roughly the same pace as the characters. It surprised me in some very enjoyable ways. The Book That Broke the World did not surprise me. There are a few new things, which I enjoyed, and a few elaborations and developments of ideas, which I mostly enjoyed, but I saw the big plot twist coming at least fifty pages before it happened and found the aftermath more annoying than revelatory. It doesn't help that the plot rests on character misunderstandings, one of my least favorite tropes. One of the other disappointments of this book is that the characters stop using the Library as a library. The Library at the center of this series is a truly marvelous piece of world-building with numerous fascinating features that are unrelated to its contents, but Livira used it first and foremost as a repository of books. The first book was full of characters solving problems by finding a relevant book and reading it. In The Book That Broke the World, sadly, this is mostly gone. The Library is mostly reduced to a complicated Big Dumb Object setting. It's still a delightful bit of world-building, and we learn about a few new features, but I only remember two places where the actual books are important to the story. Even the book referenced in the title is mostly important as an artifact with properties unrelated to the words that it contains or to the act of reading it. 
I think this is a huge lost opportunity and something I hope Lawrence fixes in the last book of the trilogy. This book instead focuses on the politics around the existence of the Library itself. Here I'm cautiously optimistic, although a lot is going to depend on the third book. Lawrence has set up a three-sided argument between groups that I will uncharitably describe as the libertarian techbros, the "burn it all down" reactionaries, and the neoliberal centrist technocrats. All three of those positions suck, and Lawrence had better be setting the stage for Livira to find a different path. Her unwillingness to commit to any of those sides gives me hope, but bringing this plot to a satisfying conclusion is going to be tricky. I hope I like what Lawrence comes up with, but it feels far from certain. It doesn't help that he's started delivering some points with a sledgehammer, and that's where we get to the unnecessary cannibalism. Thankfully this is a fairly small part of the tail end of the book, but it was an unpleasant surprise that I did not want in this novel and that I don't think made the story any better. It's tempting to call the cannibalism gratuitous, but it does fit one of the main themes of this story, namely that humans are depressingly good at using any rule-based object in unexpected and nasty ways that are contrary to the best intentions of the designer. This is the fundamental challenge of the Library as a whole and the question that I suspect the third book will be devoted to addressing, so I understand why Lawrence wanted to emphasize his point. The reason why there is cannibalism here is directly related to a profound misunderstanding of the properties of the library, and I detected an echo of one of C.S. Lewis's arguments in The Last Battle about the nature of Hell. The problem, though, is that this is Satanic baby-killerism, to borrow a term from Fred Clark. 
There are numerous ways to show this type of perversion of well-intended systems, which I know because Lawrence used other ones in the first book that were more subtle but equally effective. One of the best parts of The Book That Wouldn't Burn is that there were few real villains. The conflict was structural, all sides had valid perspectives, and the ethical points of that story were made with some care and nuance. The problem with cannibalism as it's used here is not merely that it's gross and disgusting and off-putting to the reader, although it is all of those things. If I wanted to read horror, I would read horror novels. I don't appreciate surprise horror used for shock value in regular fantasy. But worse, it's an abandonment of moral nuance. The function of cannibalism in this story is like the function of Satanic baby-killers: it's to signal that these people are wholly and irredeemably evil. They are the Villains, they are Wrong, and they cease to be characters and become symbols of what the protagonists are fighting. This is destructive to the story because it's designed to provoke a visceral short-circuit in the reader and let the author get away with sloppy story-telling. If the author needs to use tactics like this to point out who is the villain, they have failed to set up their moral quandary properly. The worst part is that this was entirely unnecessary because Lawrence's story-telling wasn't sloppy and he set up his moral quandary just fine. No one was confused about the ethical point here. I as the reader was following without difficulty, and had appreciated the subtlety with which Lawrence posed the question. But apparently he thought he was too subtle and decided to come back to the point with a pile-driver. I think that seriously injured the story. The ethical argument here is much more engaging and thought-provoking when it's more finely balanced. 
That's a lot of complaints, mostly because this is a good book that I badly wanted to be a great book but which kept tripping over its own feet. A lot of trilogies have weak second books. Hopefully this is another example of the mid-story sag, and the finale will be worthy of the start of the story. But I have to admit the moral short-circuiting and the de-emphasis of the actual books in the library have me a bit nervous. I want a lot out of the third book, and I hope I'm not asking this author for too much. If you liked the first book, I think you'll like this one too, with the caveat that it's quite a bit darker and more violent in places, even apart from the surprise cannibalism. But if you've not started this series, you may want to wait for the third book to see if Lawrence can pull off the ending. Followed by The Book That Held Her Heart, currently scheduled for publication in April of 2025. Rating: 7 out of 10

8 September 2024

Jacob Adams: Linux's Bedtime Routine

How does Linux move from an awake machine to a hibernating one? How does it then manage to restore all state? These questions led me to read way too much C in trying to figure out how this particular hardware/software boundary is navigated. This investigation will be split into a few parts, with the first one going from invocation of hibernation to synchronizing all filesystems to disk. This article has been written using Linux version 6.9.9, the source of which can be found in many places, but can be navigated easily through the Bootlin Elixir Cross-Referencer: https://elixir.bootlin.com/linux/v6.9.9/source Each code snippet will begin with a link to the above giving the file path and the line number of the beginning of the snippet.

A Starting Point for Investigation: /sys/power/state and /sys/power/disk

These two system files exist to allow debugging of hibernation, and so give direct control over the exact state used. Writing specific values to the state file controls the exact sleep mode used, and disk controls the specific hibernation mode[1]. This is extremely handy as an entry point to understand how these systems work, since we can just follow what happens when they are written to.

Show and Store Functions

These two files are defined using the power_attr macro: kernel/power/power.h:80
#define power_attr(_name) \
static struct kobj_attribute _name##_attr = {	\
	.attr	= {				\
		.name = __stringify(_name),	\
		.mode = 0644,			\
	},					\
	.show	= _name##_show,			\
	.store	= _name##_store,		\
}
show is called on reads and store on writes. state_show is a little boring for our purposes, as it just prints all the available sleep states. kernel/power/main.c:657
/*
 * state - control system sleep states.
 *
 * show() returns available sleep state labels, which may be "mem", "standby",
 * "freeze" and "disk" (hibernation).
 * See Documentation/admin-guide/pm/sleep-states.rst for a description of
 * what they mean.
 *
 * store() accepts one of those strings, translates it into the proper
 * enumerated value, and initiates a suspend transition.
 */
static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *attr,
			  char *buf)
{
	char *s = buf;
#ifdef CONFIG_SUSPEND
	suspend_state_t i;
	for (i = PM_SUSPEND_MIN; i < PM_SUSPEND_MAX; i++)
		if (pm_states[i])
			s += sprintf(s,"%s ", pm_states[i]);
#endif
	if (hibernation_available())
		s += sprintf(s, "disk ");
	if (s != buf)
		/* convert the last space to a newline */
		*(s-1) = '\n';
	return (s - buf);
}
state_store, however, is where things get interesting. If the string disk is written to the state file, it calls hibernate(). This is our entry point. kernel/power/main.c:715
static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
			   const char *buf, size_t n)
{
	suspend_state_t state;
	int error;
	error = pm_autosleep_lock();
	if (error)
		return error;
	if (pm_autosleep_state() > PM_SUSPEND_ON) {
		error = -EBUSY;
		goto out;
	}
	state = decode_state(buf, n);
	if (state < PM_SUSPEND_MAX) {
		if (state == PM_SUSPEND_MEM)
			state = mem_sleep_current;
		error = pm_suspend(state);
	} else if (state == PM_SUSPEND_MAX) {
		error = hibernate();
	} else {
		error = -EINVAL;
	}
 out:
	pm_autosleep_unlock();
	return error ? error : n;
}
kernel/power/main.c:688
static suspend_state_t decode_state(const char *buf, size_t n)
{
#ifdef CONFIG_SUSPEND
	suspend_state_t state;
#endif
	char *p;
	int len;
	p = memchr(buf, '\n', n);
	len = p ? p - buf : n;
	/* Check hibernation first. */
	if (len == 4 && str_has_prefix(buf, "disk"))
		return PM_SUSPEND_MAX;
#ifdef CONFIG_SUSPEND
	for (state = PM_SUSPEND_MIN; state < PM_SUSPEND_MAX; state++) {
		const char *label = pm_states[state];
		if (label && len == strlen(label) && !strncmp(buf, label, len))
			return state;
	}
#endif
	return PM_SUSPEND_ON;
}
Could we have figured this out just via function names? Sure, but this way we know for sure that nothing else is happening before this function is called.
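The parsing step above can be tried outside the kernel. The following is a minimal user-space sketch, not the kernel's code: the `labels` table, `decode_label`, `STATE_NONE` and `STATE_DISK` are hypothetical stand-ins for `pm_states`, `decode_state`, `PM_SUSPEND_ON` and `PM_SUSPEND_MAX`. It shows why a trailing newline (as written by `echo`) is tolerated while a longer string such as `disks` is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical label table; index 0 plays the role of PM_SUSPEND_ON. */
static const char *const labels[] = { NULL, "freeze", "standby", "mem" };
#define LABEL_COUNT (sizeof(labels) / sizeof(labels[0]))
#define STATE_NONE 0
#define STATE_DISK ((int)LABEL_COUNT) /* plays the role of PM_SUSPEND_MAX */

/* Decode buf[0..n), ignoring everything from the first newline onward. */
static int decode_label(const char *buf, size_t n)
{
	assert(buf != NULL);
	const char *p = memchr(buf, '\n', n);
	size_t len = p ? (size_t)(p - buf) : n;

	/* Hibernation is checked first, as in the kernel function. */
	if (len == 4 && strncmp(buf, "disk", 4) == 0)
		return STATE_DISK;

	/* The length must match exactly, so a mere prefix is not enough. */
	for (size_t i = 1; i < LABEL_COUNT; i++) {
		const char *label = labels[i];
		if (label && len == strlen(label) && strncmp(buf, label, len) == 0)
			return (int)i;
	}
	return STATE_NONE;
}
```

Anything that is neither a known label nor `disk` falls through to the "on" value, which `state_store` then turns into `-EINVAL`.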

Autosleep

Our first detour is into the autosleep system. When checking the state above, you may notice that the kernel grabs the pm_autosleep_lock before checking the current state. autosleep is a mechanism originally from Android that sends the entire system to either suspend or hibernate whenever it is not actively working on anything. This is not enabled for most desktop configurations, since it's primarily for mobile systems and inverts the standard suspend and hibernate interactions. This system is implemented as a workqueue[2] that checks the current number of wakeup events, processes and drivers that need to run[3], and if there aren't any, then the system is put into the autosleep state, typically suspend. However, it could be hibernate if configured that way via /sys/power/autosleep, in a similar manner to using /sys/power/state to manually enable hibernation. kernel/power/main.c:841
static ssize_t autosleep_store(struct kobject *kobj,
			       struct kobj_attribute *attr,
			       const char *buf, size_t n)
{
	suspend_state_t state = decode_state(buf, n);
	int error;
	if (state == PM_SUSPEND_ON
	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
		return -EINVAL;
	if (state == PM_SUSPEND_MEM)
		state = mem_sleep_current;
	error = pm_autosleep_set_state(state);
	return error ? error : n;
}
power_attr(autosleep);
#endif /* CONFIG_PM_AUTOSLEEP */
kernel/power/autosleep.c:24
static DEFINE_MUTEX(autosleep_lock);
static struct wakeup_source *autosleep_ws;
static void try_to_suspend(struct work_struct *work)
{
	unsigned int initial_count, final_count;
	if (!pm_get_wakeup_count(&initial_count, true))
		goto out;
	mutex_lock(&autosleep_lock);
	if (!pm_save_wakeup_count(initial_count) ||
		system_state != SYSTEM_RUNNING) {
		mutex_unlock(&autosleep_lock);
		goto out;
	}
	if (autosleep_state == PM_SUSPEND_ON) {
		mutex_unlock(&autosleep_lock);
		return;
	}
	if (autosleep_state >= PM_SUSPEND_MAX)
		hibernate();
	else
		pm_suspend(autosleep_state);
	mutex_unlock(&autosleep_lock);
	if (!pm_get_wakeup_count(&final_count, false))
		goto out;
	/*
	 * If the wakeup occurred for an unknown reason, wait to prevent the
	 * system from trying to suspend and waking up in a tight loop.
	 */
	if (final_count == initial_count)
		schedule_timeout_uninterruptible(HZ / 2);
 out:
	queue_up_suspend_work();
}
static DECLARE_WORK(suspend_work, try_to_suspend);
void queue_up_suspend_work(void)
{
	if (autosleep_state > PM_SUSPEND_ON)
		queue_work(autosleep_wq, &suspend_work);
}

The Steps of Hibernation

Hibernation Kernel Config

It's important to note that most of the hibernate-specific functions below do nothing unless you've defined CONFIG_HIBERNATION in your Kconfig[4]. As an example, hibernate itself is defined as the following if CONFIG_HIBERNATION is not set. include/linux/suspend.h:407
static inline int hibernate(void) { return -ENOSYS; }

Check if Hibernation is Available

We begin by confirming that we actually can perform hibernation, via the hibernation_available function. kernel/power/hibernate.c:742
if (!hibernation_available()) {
	pm_pr_dbg("Hibernation not available.\n");
	return -EPERM;
}
kernel/power/hibernate.c:92
bool hibernation_available(void)
{
	return nohibernate == 0 &&
		!security_locked_down(LOCKDOWN_HIBERNATION) &&
		!secretmem_active() && !cxl_mem_active();
}
nohibernate is controlled by the kernel command line: it's set via either nohibernate or hibernate=no. security_locked_down is a hook for Linux Security Modules to prevent hibernation. This is used to prevent hibernating to an unencrypted storage device, as specified in the manual page kernel_lockdown(7). Interestingly, either level of lockdown, integrity or confidentiality, locks down hibernation, because with the ability to hibernate you can extract basically anything from memory and even reboot into a modified kernel image. secretmem_active checks whether there is any active use of memfd_secret, and if so it prevents hibernation. memfd_secret returns a file descriptor that can be mapped into a process but is specifically unmapped from the kernel's memory space. Hibernating with memory that not even the kernel is supposed to access would expose that memory to whoever could access the hibernation image. This particular feature of secret memory was apparently controversial, though not as controversial as performance concerns around fragmentation when unmapping kernel memory (which did not end up being a real problem). cxl_mem_active just checks whether any CXL memory is active. A full explanation is provided in the commit introducing this check, but there's also a shortened explanation from cxl_mem_probe, which sets the relevant flag when initializing a CXL memory device. drivers/cxl/mem.c:186
* The kernel may be operating out of CXL memory on this device,
* there is no spec defined way to determine whether this device
* preserves contents over suspend, and there is no simple way
* to arrange for the suspend image to avoid CXL memory which
* would setup a circular dependency between PCI resume and save
* state restoration.

Check Compression

The next check is for whether compression support is enabled, and if so whether the requested algorithm is enabled. kernel/power/hibernate.c:747
/*
 * Query for the compression algorithm support if compression is enabled.
 */
if (!nocompress) {
	strscpy(hib_comp_algo, hibernate_compressor, sizeof(hib_comp_algo));
	if (crypto_has_comp(hib_comp_algo, 0, 0) != 1) {
		pr_err("%s compression is not available\n", hib_comp_algo);
		return -EOPNOTSUPP;
	}
}
The nocompress flag is set via the hibernate command line parameter, setting hibernate=nocompress. If compression is enabled, then hibernate_compressor is copied to hib_comp_algo. This synchronizes the current requested compression setting (hibernate_compressor) with the current compression setting (hib_comp_algo). Both values are character arrays of size CRYPTO_MAX_ALG_NAME (128 in this kernel). kernel/power/hibernate.c:50
static char hibernate_compressor[CRYPTO_MAX_ALG_NAME] = CONFIG_HIBERNATION_DEF_COMP;
/*
 * Compression/decompression algorithm to be used while saving/loading
 * image to/from disk. This would later be used in 'kernel/power/swap.c'
 * to allocate comp streams.
 */
char hib_comp_algo[CRYPTO_MAX_ALG_NAME];
hibernate_compressor defaults to lzo if that algorithm is enabled, otherwise to lz4 if enabled[5]. It can be overridden using the hibernate.compressor setting to either lzo or lz4. kernel/power/Kconfig:95
choice
	prompt "Default compressor"
	default HIBERNATION_COMP_LZO
	depends on HIBERNATION
config HIBERNATION_COMP_LZO
	bool "lzo"
	depends on CRYPTO_LZO
config HIBERNATION_COMP_LZ4
	bool "lz4"
	depends on CRYPTO_LZ4
endchoice
config HIBERNATION_DEF_COMP
	string
	default "lzo" if HIBERNATION_COMP_LZO
	default "lz4" if HIBERNATION_COMP_LZ4
	help
	  Default compressor to be used for hibernation.
kernel/power/hibernate.c:1425
static const char * const comp_alg_enabled[] = {
#if IS_ENABLED(CONFIG_CRYPTO_LZO)
	COMPRESSION_ALGO_LZO,
#endif
#if IS_ENABLED(CONFIG_CRYPTO_LZ4)
	COMPRESSION_ALGO_LZ4,
#endif
};
static int hibernate_compressor_param_set(const char *compressor,
		const struct kernel_param *kp)
{
	unsigned int sleep_flags;
	int index, ret;
	sleep_flags = lock_system_sleep();
	index = sysfs_match_string(comp_alg_enabled, compressor);
	if (index >= 0) {
		ret = param_set_copystring(comp_alg_enabled[index], kp);
		if (!ret)
			strscpy(hib_comp_algo, comp_alg_enabled[index],
				sizeof(hib_comp_algo));
	} else {
		ret = index;
	}
	unlock_system_sleep(sleep_flags);
	if (ret)
		pr_debug("Cannot set specified compressor %s\n",
			 compressor);
	return ret;
}
static const struct kernel_param_ops hibernate_compressor_param_ops = {
	.set    = hibernate_compressor_param_set,
	.get    = param_get_string,
};
static struct kparam_string hibernate_compressor_param_string = {
	.maxlen = sizeof(hibernate_compressor),
	.string = hibernate_compressor,
};
We then check whether the requested algorithm is supported via crypto_has_comp. If not, we bail out of the whole operation with EOPNOTSUPP. As part of crypto_has_comp we perform any needed initialization of the algorithm, loading kernel modules and running initialization code as needed[6].

Grab Locks

The next step is to grab the sleep and hibernation locks via lock_system_sleep and hibernate_acquire. kernel/power/hibernate.c:758
sleep_flags = lock_system_sleep();
/* The snapshot device should not be opened while we're running */
if (!hibernate_acquire()) {
	error = -EBUSY;
	goto Unlock;
}
First, lock_system_sleep marks the current thread as not freezable, which will be important later[7]. It then grabs the system_transition_mutex, which, while held, prevents anyone else from taking snapshots or modifying how they are taken, resuming from a hibernation image, entering any suspend state, or rebooting.

The GFP Mask

The kernel also issues a warning if the gfp mask is changed via either pm_restore_gfp_mask or pm_restrict_gfp_mask without holding the system_transition_mutex. GFP flags tell the kernel how it is permitted to handle a request for memory. include/linux/gfp_types.h:12
 * GFP flags are commonly used throughout Linux to indicate how memory
 * should be allocated.  The GFP acronym stands for get_free_pages(),
 * the underlying memory allocation function.  Not every GFP flag is
 * supported by every function which may allocate memory.
In the case of hibernation specifically we care about the IO and FS flags, which are reclaim modifiers: ways the system is permitted to attempt to free up memory in order to satisfy a specific request for memory. include/linux/gfp_types.h:176
 * Reclaim modifiers
 * -----------------
 * Please note that all the following flags are only applicable to sleepable
 * allocations (e.g. %GFP_NOWAIT and %GFP_ATOMIC will ignore them).
 *
 * %__GFP_IO can start physical IO.
 *
 * %__GFP_FS can call down to the low-level FS. Clearing the flag avoids the
 * allocator recursing into the filesystem which might already be holding
 * locks.
gfp_allowed_mask sets which flags are permitted to be set at the current time. As the comment below outlines, preventing these flags from being set avoids situations where the kernel needs to do I/O to allocate memory (e.g. reading from or writing to swap[8]) but the devices it needs to read from or write to are not currently available. kernel/power/main.c:24
/*
 * The following functions are used by the suspend/hibernate code to temporarily
 * change gfp_allowed_mask in order to avoid using I/O during memory allocations
 * while devices are suspended.  To avoid races with the suspend/hibernate code,
 * they should always be called with system_transition_mutex held
 * (gfp_allowed_mask also should only be modified with system_transition_mutex
 * held, unless the suspend/hibernate code is guaranteed not to run in parallel
 * with that modification).
 */
static gfp_t saved_gfp_mask;
void pm_restore_gfp_mask(void)
{
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	if (saved_gfp_mask) {
		gfp_allowed_mask = saved_gfp_mask;
		saved_gfp_mask = 0;
	}
}
void pm_restrict_gfp_mask(void)
{
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	WARN_ON(saved_gfp_mask);
	saved_gfp_mask = gfp_allowed_mask;
	gfp_allowed_mask &= ~(__GFP_IO | __GFP_FS);
}
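The save, restrict and restore pattern used by these two functions can be sketched in plain C. This is an illustration only: the `FLAG_*` bit values are made up (the kernel's real values differ), and `effective_flags` is a hypothetical helper standing in for the point where an allocation request is filtered through the allowed mask.

```c
#include <assert.h>

/* Hypothetical bit values for illustration; not the kernel's. */
#define FLAG_IO    0x1u
#define FLAG_FS    0x2u
#define FLAG_OTHER 0x4u

static unsigned int allowed_mask = FLAG_IO | FLAG_FS | FLAG_OTHER;
static unsigned int saved_mask; /* 0 means "nothing saved" */

/* Remember the current mask, then forbid I/O and filesystem reclaim. */
static void restrict_mask(void)
{
	assert(saved_mask == 0); /* restricting twice would lose the original */
	saved_mask = allowed_mask;
	allowed_mask &= ~(FLAG_IO | FLAG_FS);
}

/* Put the saved mask back; harmless if nothing was saved. */
static void restore_mask(void)
{
	if (saved_mask) {
		allowed_mask = saved_mask;
		saved_mask = 0;
	}
}

/* A request only keeps the flags that are currently allowed. */
static unsigned int effective_flags(unsigned int requested)
{
	return requested & allowed_mask;
}
```

While the mask is restricted, a request that asks for I/O simply has that bit stripped, so reclaim cannot try to touch a suspended device.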

Sleep Flags

After grabbing the system_transition_mutex, lock_system_sleep returns the previous state of the thread's flags, which the caller captures in sleep_flags. This is used later to remove PF_NOFREEZE if it wasn't previously set on the current thread. kernel/power/main.c:52
unsigned int lock_system_sleep(void)
{
	unsigned int flags = current->flags;
	current->flags |= PF_NOFREEZE;
	mutex_lock(&system_transition_mutex);
	return flags;
}
EXPORT_SYMBOL_GPL(lock_system_sleep);
include/linux/sched.h:1633
#define PF_NOFREEZE		0x00008000	/* This thread should not be frozen */
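To make the purpose of the returned flags concrete, here is a small single-threaded sketch of the lock and unlock pair. It is an assumption-laden illustration: `task_flags` stands in for `current->flags`, `transition_locked` stands in for the mutex, and `unlock_sleep` is written from the description above (clear the bit only if it was not set beforehand) rather than copied from the kernel.

```c
#include <assert.h>

#define NOFREEZE 0x8000u /* same bit as PF_NOFREEZE in the excerpt above */

static unsigned int task_flags; /* stand-in for current->flags */
static int transition_locked;   /* stand-in for system_transition_mutex */

/* Set NOFREEZE, take the lock, and hand back the flags as they were. */
static unsigned int lock_sleep(void)
{
	unsigned int flags = task_flags;
	task_flags |= NOFREEZE;
	assert(!transition_locked); /* a real mutex would block here instead */
	transition_locked = 1;
	return flags;
}

/* Clear NOFREEZE only if the caller did not have it set before locking. */
static void unlock_sleep(unsigned int flags)
{
	if (!(flags & NOFREEZE))
		task_flags &= ~NOFREEZE;
	transition_locked = 0;
}
```

Returning the old flags lets a thread that was already unfreezable stay that way after unlocking, without any extra bookkeeping.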
Then we grab the hibernate-specific lock, an atomic counter used like a semaphore, to ensure no one can open a snapshot or resume from it while we perform hibernation. Additionally this lock is used to exclude hibernate_quiet_exec, which is used by the nvdimm driver to activate its firmware with all processes and devices frozen, ensuring it is the only thing running at that time[9]. kernel/power/hibernate.c:82
bool hibernate_acquire(void)
{
	return atomic_add_unless(&hibernate_atomic, -1, 0);
}

Prepare Console

The kernel next calls pm_prepare_console. This function only does anything if CONFIG_VT_CONSOLE_SLEEP has been set. This prepares the virtual terminal for a suspend state, switching away to a console used only for the suspend state if needed. kernel/power/console.c:130
void pm_prepare_console(void)
{
	if (!pm_vt_switch())
		return;
	orig_fgconsole = vt_move_to_console(SUSPEND_CONSOLE, 1);
	if (orig_fgconsole < 0)
		return;
	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
	return;
}
The first thing is to check whether we actually need to switch the VT. kernel/power/console.c:94
/*
 * There are three cases when a VT switch on suspend/resume are required:
 *   1) no driver has indicated a requirement one way or another, so preserve
 *      the old behavior
 *   2) console suspend is disabled, we want to see debug messages across
 *      suspend/resume
 *   3) any registered driver indicates it needs a VT switch
 *
 * If none of these conditions is present, meaning we have at least one driver
 * that doesn't need the switch, and none that do, we can avoid it to make
 * resume look a little prettier (and suspend too, but that's usually hidden,
 * e.g. when closing the lid on a laptop).
 */
static bool pm_vt_switch(void)
{
	struct pm_vt_switch *entry;
	bool ret = true;
	mutex_lock(&vt_switch_mutex);
	if (list_empty(&pm_vt_switch_list))
		goto out;
	if (!console_suspend_enabled)
		goto out;
	list_for_each_entry(entry, &pm_vt_switch_list, head) {
		if (entry->required)
			goto out;
	}
	ret = false;
out:
	mutex_unlock(&vt_switch_mutex);
	return ret;
}
There is an explanation of the conditions under which a switch is performed in the comment above the function, but we'll also walk through the steps here. Firstly we grab the vt_switch_mutex to ensure nothing will modify the list while we're looking at it. We then examine the pm_vt_switch_list; if it is empty, no driver has expressed a preference, so we keep the old behavior and switch. This list is used to indicate the drivers that require a switch during suspend. They register this requirement, or the lack thereof, via pm_vt_switch_required. kernel/power/console.c:31
/**
 * pm_vt_switch_required - indicate VT switch at suspend requirements
 * @dev: device
 * @required: if true, caller needs VT switch at suspend/resume time
 *
 * The different console drivers may or may not require VT switches across
 * suspend/resume, depending on how they handle restoring video state and
 * what may be running.
 *
 * Drivers can indicate support for switchless suspend/resume, which can
 * save time and flicker, by using this routine and passing 'false' as
 * the argument.  If any loaded driver needs VT switching, or the
 * no_console_suspend argument has been passed on the command line, VT
 * switches will occur.
 */
void pm_vt_switch_required(struct device *dev, bool required)
Next, we check console_suspend_enabled. This is set to false by the kernel parameter no_console_suspend, but defaults to true. Finally, if there are any entries in the pm_vt_switch_list, then we check to see if any of them require a VT switch. We return false only if none of these conditions apply. If a VT switch is in fact required, then we move first the currently active virtual terminal/console[10] (vt_move_to_console) and then the current location of kernel messages (vt_kmsg_redirect) to the SUSPEND_CONSOLE. The SUSPEND_CONSOLE is the last entry in the list of possible consoles, and appears to just be a black hole to throw away messages. kernel/power/console.c:16
#define SUSPEND_CONSOLE	(MAX_NR_CONSOLES-1)
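The three cases that force a VT switch can be condensed into one small function. This is a sketch over a plain array rather than the kernel's linked list, and `struct vt_pref` and `need_vt_switch` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pm_vt_switch_list entry. */
struct vt_pref {
	bool required;
};

static bool need_vt_switch(const struct vt_pref *prefs, size_t count,
			   bool console_suspend_enabled)
{
	assert(prefs != NULL || count == 0);
	if (count == 0)               /* case 1: no driver has said anything */
		return true;
	if (!console_suspend_enabled) /* case 2: keep debug messages visible */
		return true;
	for (size_t i = 0; i < count; i++)
		if (prefs[i].required)    /* case 3: some driver needs the switch */
			return true;
	return false;                 /* every registered driver opted out */
}
```

The default is to switch; the switch is skipped only when at least one driver has registered and none of them asked for it.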
Interestingly, these are separate functions because you can use TIOCL_SETKMSGREDIRECT (an ioctl[11]) to send kernel messages to a specific virtual terminal, but by default it's the same as the currently active console. The locations of the previously active console and the previous kernel messages location are stored in orig_fgconsole and orig_kmsg, to restore the state of the console and kernel messages after the machine wakes up again. Interestingly, this means orig_fgconsole also ends up storing any errors, so it has to be checked to ensure it's not less than zero before we try to do anything with the kernel messages on both suspend and resume. drivers/tty/vt/vt_ioctl.c:1268
/* Perform a kernel triggered VT switch for suspend/resume */
static int disable_vt_switch;
int vt_move_to_console(unsigned int vt, int alloc)
{
	int prev;
	console_lock();
	/* Graphics mode - up to X */
	if (disable_vt_switch) {
		console_unlock();
		return 0;
	}
	prev = fg_console;
	if (alloc && vc_allocate(vt)) {
		/* we can't have a free VC for now. Too bad,
		 * we don't want to mess the screen for now. */
		console_unlock();
		return -ENOSPC;
	}
	if (set_console(vt)) {
		/*
		 * We're unable to switch to the SUSPEND_CONSOLE.
		 * Let the calling function know so it can decide
		 * what to do.
		 */
		console_unlock();
		return -EIO;
	}
	console_unlock();
	if (vt_waitactive(vt + 1)) {
		pr_debug("Suspend: Can't switch VCs.");
		return -EINTR;
	}
	return prev;
}
Unlike most other locking functions we've seen so far, console_lock needs to be careful to ensure nothing else is panicking and needs to dump to the console before grabbing the semaphore for the console and setting a couple of flags.

Panics

Panics are tracked via an atomic integer set to the id of the processor currently panicking. kernel/printk/printk.c:2649
/**
 * console_lock - block the console subsystem from printing
 *
 * Acquires a lock which guarantees that no consoles will
 * be in or enter their write() callback.
 *
 * Can sleep, returns nothing.
 */
void console_lock(void)
{
	might_sleep();
	/* On panic, the console_lock must be left to the panic cpu. */
	while (other_cpu_in_panic())
		msleep(1000);
	down_console_sem();
	console_locked = 1;
	console_may_schedule = 1;
}
EXPORT_SYMBOL(console_lock);
kernel/printk/printk.c:362
/*
 * Return true if a panic is in progress on a remote CPU.
 *
 * On true, the local CPU should immediately release any printing resources
 * that may be needed by the panic CPU.
 */
bool other_cpu_in_panic(void)
{
	return (panic_in_progress() && !this_cpu_in_panic());
}
kernel/printk/printk.c:345
static bool panic_in_progress(void)
{
	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
}
kernel/printk/printk.c:350
/* Return true if a panic is in progress on the current CPU. */
bool this_cpu_in_panic(void)
{
	/*
	 * We can use raw_smp_processor_id() here because it is impossible for
	 * the task to be migrated to the panic_cpu, or away from it. If
	 * panic_cpu has already been set, and we're not currently executing on
	 * that CPU, then we never will be.
	 */
	return unlikely(atomic_read(&panic_cpu) == raw_smp_processor_id());
}
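The relationship between these three helpers is easier to see in a small model. The sketch below uses C11 atomics and passes the CPU id explicitly, since user space has no raw_smp_processor_id(); `CPU_INVALID` and `try_claim_panic` are hypothetical names, with the latter only assumed to resemble how a panicking CPU would record itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CPU_INVALID (-1) /* plays the role of PANIC_CPU_INVALID */

static atomic_int panic_cpu = CPU_INVALID;

static bool panic_in_progress(void)
{
	return atomic_load(&panic_cpu) != CPU_INVALID;
}

/* this_cpu is a parameter here; the kernel asks the scheduler instead. */
static bool other_cpu_in_panic(int this_cpu)
{
	return panic_in_progress() && atomic_load(&panic_cpu) != this_cpu;
}

/* Hypothetical helper: only the first CPU to panic gets to claim the slot. */
static bool try_claim_panic(int cpu)
{
	assert(cpu >= 0);
	int expected = CPU_INVALID;
	return atomic_compare_exchange_strong(&panic_cpu, &expected, cpu);
}
```

From the point of view of the panicking CPU itself, `other_cpu_in_panic` stays false, so it is never made to wait in console_lock.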
console_locked is a debug value, used to indicate that the lock should be held, and our first indication that this whole virtual terminal system is more complex than might initially be expected. kernel/printk/printk.c:373
/*
 * This is used for debugging the mess that is the VT code by
 * keeping track if we have the console semaphore held. It's
 * definitely not the perfect debug tool (we don't know if _WE_
 * hold it and are racing, but it helps tracking those weird code
 * paths in the console code where we end up in places I want
 * locked without the console semaphore held).
 */
static int console_locked;
console_may_schedule is used to see if we are permitted to sleep and schedule other work while we hold this lock. As we'll see later, the virtual terminal subsystem is not re-entrant, so there are all sorts of hacks in here to ensure we don't leave important code sections that can't be safely resumed.

Disable VT Switch

As the comment below lays out, when another program is handling graphical display anyway, there's no need to do any of this, so the kernel provides a switch to turn the whole thing off. Interestingly, this appears to only be used by three drivers, so the specific hardware support required must not be particularly common.
drivers/gpu/drm/omapdrm/dss
drivers/video/fbdev/geode
drivers/video/fbdev/omap2
drivers/tty/vt/vt_ioctl.c:1308
/*
 * Normally during a suspend, we allocate a new console and switch to it.
 * When we resume, we switch back to the original console.  This switch
 * can be slow, so on systems where the framebuffer can handle restoration
 * of video registers anyways, there's little point in doing the console
 * switch.  This function allows you to disable it by passing it '0'.
 */
void pm_set_vt_switch(int do_switch)
{
	console_lock();
	disable_vt_switch = !do_switch;
	console_unlock();
}
EXPORT_SYMBOL(pm_set_vt_switch);
The rest of the vt_move_to_console function is pretty normal, however, simply allocating space if needed to create the requested virtual terminal and then setting the current virtual terminal via set_console.

Virtual Terminal Set Console

With set_console, we begin (as if we haven't been already) to enter the madness that is the virtual terminal subsystem. As mentioned previously, modifications to its state must be made very carefully, as other stuff happening at the same time could create complete messes. All this to say, calling set_console does not actually perform any work to change the state of the current console. Instead it indicates what changes it wants and then schedules that work. drivers/tty/vt/vt.c:3153
int set_console(int nr)
{
	struct vc_data *vc = vc_cons[fg_console].d;
	if (!vc_cons_allocated(nr) || vt_dont_switch ||
		(vc->vt_mode.mode == VT_AUTO && vc->vc_mode == KD_GRAPHICS)) {
		/*
		 * Console switch will fail in console_callback() or
		 * change_console() so there is no point scheduling
		 * the callback
		 *
		 * Existing set_console() users don't check the return
		 * value so this shouldn't break anything
		 */
		return -EINVAL;
	}
	want_console = nr;
	schedule_console_callback();
	return 0;
}
The check for vc->vc_mode == KD_GRAPHICS is where most end-user graphical desktops will bail out of this change, as they're in graphics mode and don't need to switch away to the suspend console. vt_dont_switch is a flag used by the ioctls[11] VT_LOCKSWITCH and VT_UNLOCKSWITCH to prevent the system from switching virtual terminal devices when the user has explicitly locked it. VT_AUTO is a flag indicating that automatic virtual terminal switching is enabled[12], and thus deliberate switching to a suspend terminal is not required. However, if you do run your machine from a virtual terminal, then we indicate to the system that we want to change to the requested virtual terminal via the want_console variable and schedule a callback via schedule_console_callback. drivers/tty/vt/vt.c:315
void schedule_console_callback(void)
{
	schedule_work(&console_work);
}
console_work is a work item queued on a workqueue[2], which will execute console_callback asynchronously.

Console Callback

drivers/tty/vt/vt.c:3109
/*
 * This is the console switching callback.
 *
 * Doing console switching in a process context allows
 * us to do the switches asynchronously (needed when we want
 * to switch due to a keyboard interrupt).  Synchronization
 * with other console code and prevention of re-entrancy is
 * ensured with console_lock.
 */
static void console_callback(struct work_struct *ignored)
{
	console_lock();
	if (want_console >= 0) {
		if (want_console != fg_console &&
		    vc_cons_allocated(want_console)) {
			hide_cursor(vc_cons[fg_console].d);
			change_console(vc_cons[want_console].d);
			/* we only changed when the console had already
			   been allocated - a new console is not created
			   in an interrupt routine */
		}
		want_console = -1;
	}
...
console_callback first looks to see if there is a console change wanted via want_console and then changes to it if it's not the current console and has been allocated already. Before switching, we remove any cursor state with hide_cursor. drivers/tty/vt/vt.c:841
static void hide_cursor(struct vc_data *vc)
{
	if (vc_is_sel(vc))
		clear_selection();
	vc->vc_sw->con_cursor(vc, false);
	hide_softcursor(vc);
}
A full dive into the tty driver is a task for another time, but this should give a general sense of how this system interacts with hibernation.

Notify Power Management Call Chain

kernel/power/hibernate.c:767
pm_notifier_call_chain_robust(PM_HIBERNATION_PREPARE, PM_POST_HIBERNATION)
This calls each callback in the power management notifier chain with PM_HIBERNATION_PREPARE. If one of them fails, the callbacks that had already been notified are called again with PM_POST_HIBERNATION so that they can undo their preparation. kernel/power/main.c:98
int pm_notifier_call_chain_robust(unsigned long val_up, unsigned long val_down)
{
	int ret;
	ret = blocking_notifier_call_chain_robust(&pm_chain_head, val_up, val_down, NULL);
	return notifier_to_errno(ret);
}
The power management notifier is a blocking notifier chain, which means it has the following properties. include/linux/notifier.h:23
 *	Blocking notifier chains: Chain callbacks run in process context.
 *		Callouts are allowed to block.
The callback chain is a linked list with each entry containing a priority and a function to call. The function technically takes in a data value, but it is always NULL for the power management chain. include/linux/notifier.h:49
struct notifier_block;
typedef	int (*notifier_fn_t)(struct notifier_block *nb,
			unsigned long action, void *data);
struct notifier_block {
	notifier_fn_t notifier_call;
	struct notifier_block __rcu *next;
	int priority;
};
The head of the linked list is protected by a read-write semaphore. include/linux/notifier.h:65
struct blocking_notifier_head {
	struct rw_semaphore rwsem;
	struct notifier_block __rcu *head;
};
Because the list is ordered by priority, adding an entry requires walking it until an item with lower[13] priority is found, and inserting the new item before it. kernel/notifier.c:252
/*
 *	Blocking notifier chain routines.  All access to the chain is
 *	synchronized by an rwsem.
 */
static int __blocking_notifier_chain_register(struct blocking_notifier_head *nh,
					      struct notifier_block *n,
					      bool unique_priority)
{
	int ret;
	/*
	 * This code gets used during boot-up, when task switching is
	 * not yet working and interrupts must remain disabled.  At
	 * such times we must not call down_write().
	 */
	if (unlikely(system_state == SYSTEM_BOOTING))
		return notifier_chain_register(&nh->head, n, unique_priority);
	down_write(&nh->rwsem);
	ret = notifier_chain_register(&nh->head, n, unique_priority);
	up_write(&nh->rwsem);
	return ret;
}
kernel/notifier.c:20
/*
 *	Notifier chain core routines.  The exported routines below
 *	are layered on top of these, with appropriate locking added.
 */
static int notifier_chain_register(struct notifier_block **nl,
				   struct notifier_block *n,
				   bool unique_priority)
{
	while ((*nl) != NULL) {
		if (unlikely((*nl) == n)) {
			WARN(1, "notifier callback %ps already registered",
			     n->notifier_call);
			return -EEXIST;
		}
		if (n->priority > (*nl)->priority)
			break;
		if (n->priority == (*nl)->priority && unique_priority)
			return -EBUSY;
		nl = &((*nl)->next);
	}
	n->next = *nl;
	rcu_assign_pointer(*nl, n);
	trace_notifier_register((void *)n->notifier_call);
	return 0;
}
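To make the insertion order concrete, here is a minimal userspace sketch of the same walk, with the locking, RCU, tracing, and unique-priority handling stripped out. The names struct node and chain_insert are hypothetical stand-ins, not kernel identifiers.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for struct notifier_block. */
struct node {
	int priority;
	struct node *next;
};

/*
 * Insert n before the first entry with strictly lower priority.
 * Entries of equal priority therefore stay in registration order.
 * Returns 0 on success, -1 if n is already on the list.
 */
static int chain_insert(struct node **link, struct node *n)
{
	while (*link != NULL) {
		if (*link == n)
			return -1;
		if (n->priority > (*link)->priority)
			break;
		/* Advance to the address of the next pointer, not the node. */
		link = &(*link)->next;
	}
	n->next = *link;
	*link = n;
	return 0;
}
```

Walking a pointer to the link rather than a pointer to the node is what lets the same two assignments handle insertion at the head, in the middle, and at the tail without special cases.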
Each callback can return one of a series of options. include/linux/notifier.h:18
#define NOTIFY_DONE		0x0000		/* Don't care */
#define NOTIFY_OK		0x0001		/* Suits me */
#define NOTIFY_STOP_MASK	0x8000		/* Don't call further */
#define NOTIFY_BAD		(NOTIFY_STOP_MASK|0x0002)
						/* Bad/Veto action */
When notifying the chain, if a callback returns a value with NOTIFY_STOP_MASK set (such as NOTIFY_BAD), the callbacks before it in the chain are called again with PM_POST_HIBERNATION[14] and an error is returned. kernel/notifier.c:107
/**
 * notifier_call_chain_robust - Inform the registered notifiers about an event
 *                              and rollback on error.
 * @nl:		Pointer to head of the blocking notifier chain
 * @val_up:	Value passed unmodified to the notifier function
 * @val_down:	Value passed unmodified to the notifier function when recovering
 *              from an error on @val_up
 * @v:		Pointer passed unmodified to the notifier function
 *
 * NOTE:	It is important the @nl chain doesn't change between the two
 *		invocations of notifier_call_chain() such that we visit the
 *		exact same notifier callbacks; this rules out any RCU usage.
 *
 * Return:	the return value of the @val_up call.
 */
static int notifier_call_chain_robust(struct notifier_block **nl,
				     unsigned long val_up, unsigned long val_down,
				     void *v)
{
	int ret, nr = 0;
	ret = notifier_call_chain(nl, val_up, v, -1, &nr);
	if (ret & NOTIFY_STOP_MASK)
		notifier_call_chain(nl, val_down, v, nr-1, NULL);
	return ret;
}
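The rollback is easier to see with a toy chain. The following is a userspace sketch only: struct cb, sample_cb, call_chain, call_chain_robust, and the ACTION_* values are all hypothetical names invented for illustration, and the inner loop is my approximation of what notifier_call_chain does rather than a copy of it.

```c
#include <stddef.h>

#define STOP_MASK      0x8000	/* mirrors NOTIFY_STOP_MASK */
#define ACTION_PREPARE 1UL	/* hypothetical "going down" event */
#define ACTION_POST    2UL	/* hypothetical "undo" event */

struct cb {
	int (*fn)(struct cb *self, unsigned long action);
	struct cb *next;
	int veto;		/* nonzero: refuse ACTION_PREPARE */
	int calls;		/* how many times this callback ran */
	unsigned long last;	/* last action this callback saw */
};

/* Sample callback: records what it saw and optionally vetoes. */
static int sample_cb(struct cb *self, unsigned long action)
{
	self->calls++;
	self->last = action;
	if (self->veto && action == ACTION_PREPARE)
		return STOP_MASK | 0x0002;
	return 0x0001;
}

/* Call at most nr_to_call callbacks (-1 means no limit). */
static int call_chain(struct cb *head, unsigned long action,
		      int nr_to_call, int *nr_calls)
{
	int ret = 0;

	for (struct cb *c = head; c != NULL && nr_to_call != 0; c = c->next) {
		ret = c->fn(c, action);
		nr_to_call--;
		if (nr_calls)
			(*nr_calls)++;
		if (ret & STOP_MASK)
			break;
	}
	return ret;
}

static int call_chain_robust(struct cb *head,
			     unsigned long up, unsigned long down)
{
	int nr = 0;
	int ret = call_chain(head, up, -1, &nr);

	/* nr counts the failing callback too, so only nr - 1 are undone. */
	if (ret & STOP_MASK)
		call_chain(head, down, nr - 1, NULL);
	return ret;
}
```

With a three-entry chain where the second callback vetoes, the first callback sees ACTION_PREPARE and then ACTION_POST, the second sees only ACTION_PREPARE, and the third is never called.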
Each of these callbacks tends to be quite driver-specific, so we'll cease discussion of this here.

Sync Filesystems

The next step is to ensure all filesystems have been synchronized to disk. This is performed via a simple helper function that times how long the full synchronize operation, ksys_sync, takes. kernel/power/main.c:69
void ksys_sync_helper(void)
{
	ktime_t start;
	long elapsed_msecs;
	start = ktime_get();
	ksys_sync();
	elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
	pr_info("Filesystems sync: %ld.%03ld seconds\n",
		elapsed_msecs / MSEC_PER_SEC, elapsed_msecs % MSEC_PER_SEC);
}
EXPORT_SYMBOL_GPL(ksys_sync_helper);
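The pr_info call prints a millisecond count as seconds with three decimal places using only integer division and remainder, presumably because kernel code generally avoids floating point. A small userspace sketch of the same formatting follows; format_elapsed is a hypothetical helper and MSEC_PER_SEC is redefined locally.

```c
#include <stdio.h>

#define MSEC_PER_SEC 1000L	/* local copy of the kernel constant */

/*
 * Hypothetical helper: write elapsed_msecs as "seconds.milliseconds"
 * into buf. The %03ld zero-pads the remainder so 7 ms prints as 0.007
 * rather than 0.7.
 */
static int format_elapsed(char *buf, size_t len, long elapsed_msecs)
{
	return snprintf(buf, len, "%ld.%03ld",
			elapsed_msecs / MSEC_PER_SEC,
			elapsed_msecs % MSEC_PER_SEC);
}
```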
ksys_sync wakes and instructs a set of flusher threads to write out every filesystem, first their inodes[15], then the full filesystem, and then finally all block devices, to ensure all pages are written out to disk. fs/sync.c:87
/*
 * Sync everything. We start by waking flusher threads so that most of
 * writeback runs on all devices in parallel. Then we sync all inodes reliably
 * which effectively also waits for all flusher threads to finish doing
 * writeback. At this point all data is on disk so metadata should be stable
 * and we tell filesystems to sync their metadata via ->sync_fs() calls.
 * Finally, we writeout all block devices because some filesystems (e.g. ext2)
 * just write metadata (such as inodes or bitmaps) to block device page cache
 * and do not sync it on their own in ->sync_fs().
 */
void ksys_sync(void)
{
	int nowait = 0, wait = 1;
	wakeup_flusher_threads(WB_REASON_SYNC);
	iterate_supers(sync_inodes_one_sb, NULL);
	iterate_supers(sync_fs_one_sb, &nowait);
	iterate_supers(sync_fs_one_sb, &wait);
	sync_bdevs(false);
	sync_bdevs(true);
	if (unlikely(laptop_mode))
		laptop_sync_completion();
}
It follows an interesting pattern of using iterate_supers to run both sync_inodes_one_sb and then sync_fs_one_sb on each known filesystem[16]. It also calls both sync_fs_one_sb and sync_bdevs twice, first without waiting for any operations to complete and then again waiting for completion[17]. When laptop_mode is enabled, the system runs additional filesystem synchronization operations once the specified delay has elapsed without any disk activity. mm/page-writeback.c:111
/*
 * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
 * a full sync is triggered after this time elapses without any disk activity.
 */
int laptop_mode;
EXPORT_SYMBOL(laptop_mode);
However, when running a filesystem synchronization operation, the system will add an additional timer to schedule more writes after the laptop_mode delay. We don't want the state of the system to change at all while performing hibernation, so we cancel those timers. mm/page-writeback.c:2198
/*
 * We're in laptop mode and we've just synced. The sync's writes will have
 * caused another writeback to be scheduled by laptop_io_completion.
 * Nothing needs to be written back anymore, so we unschedule the writeback.
 */
void laptop_sync_completion(void)
{
	struct backing_dev_info *bdi;
	rcu_read_lock();
	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
		del_timer(&bdi->laptop_mode_wb_timer);
	rcu_read_unlock();
}
As a side note, the sync system call is nothing more than a call to ksys_sync. fs/sync.c:111
SYSCALL_DEFINE0(sync)
{
	ksys_sync();
	return 0;
}

The End of Preparation

With that, the system has finished preparations for hibernation. This is a somewhat arbitrary cutoff, but next the system will begin a full freeze of userspace, then dump memory out to an image, and finally perform hibernation. All this will be covered in future articles!
  1. Hibernation modes are outside of scope for this article, see the previous article for a high-level description of the different types of hibernation.
  2. Workqueues are a mechanism for running asynchronous tasks. A full description of them is a task for another time, but the kernel documentation on them is available here: https://www.kernel.org/doc/html/v6.9/core-api/workqueue.html
  3. This is a bit of an oversimplification, but since this isn't the main focus of this article this description has been kept to a higher level.
  4. Kconfig is Linux's build configuration system that sets many different macros to enable/disable various features.
  5. Kconfig defaults to the first default found.
  6. Including checking whether the algorithm is larval? Which appears to indicate that it requires additional setup, but is an interesting choice of name for such a state.
  7. Specifically when we get to process freezing, which we'll get to in the next article in this series.
  8. Swap space is outside the scope of this article, but in short it is a buffer on disk that the kernel uses to store memory not currently in use to free up space for other things. See Swap Management for more details.
  9. The code for this is lengthy and tangential, thus it has not been included here. If you're curious about the details of this, see kernel/power/hibernate.c:858 for the details of hibernate_quiet_exec, and drivers/nvdimm/core.c:451 for how it is used in nvdimm.
  10. Annoyingly this code appears to use the terms console and virtual terminal interchangeably.
  11. ioctls are special device-specific I/O operations that permit performing actions outside of the standard file interactions of read/write/seek/etc.
  12. I'm not entirely clear on how this flag works, this subsystem is particularly complex.
  13. In this case a higher number is higher priority.
  14. Or whatever the caller passes as val_down, but in this case we're specifically looking at how this is used in hibernation.
  15. An inode refers to a particular file or directory within the filesystem. See Wikipedia for more details.
  16. Each active filesystem is registered with the kernel through a structure known as a superblock, which contains references to all the inodes contained within the filesystem, as well as function pointers to perform the various required operations, like sync.
  17. I'm including minimal code in this section, as I'm not looking to deep dive into the filesystem code at this time.
