Search Results: "nma"

30 May 2025

Utkarsh Gupta: FOSS Activities in May 2025

Here's my 68th monthly but brief update about the activities I've done in the F/L/OSS world.

Debian
This was my 77th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas '19! \o/ This month I've just been sort of MIA, mostly because of a combination of the Canonical engineering sprints in Frankfurt, a bit of vacation in Italy, and then being sick. So didn't really get much done in Debian this month.

Ubuntu
This was my 53rd month of actively contributing to Ubuntu. I joined Canonical to work on Ubuntu full-time back in February 2021. Whilst I can't give a full, detailed list of things I did (there's so much and some of it might not be public yet!), here's a quick TL;DR of what I did:

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the buster, stretch, and jessie releases (+2 years after LTS support). This was my 68th month as a Debian LTS and 55th month as a Debian ELTS paid contributor.
Due to a combination of the Canonical engineering sprints in Frankfurt, a bit of vacation in Italy, and then being sick, I was barely able to do (E)LTS work. So this month, I worked for only 1.00 hours for LTS and 0 hours for ELTS. I did the following things:
  • [LTS] Attended the hourly LTS meeting on IRC. Summary here.

Until next time.
:wq for today.

25 May 2025

Iustin Pop: Corydalis v2025.21.0 - new features!

I just released yesterday a new version of Corydalis (https://demo.corydalis.io, https://github.com/iustin/corydalis). To me personally, it's a major improvement, since the native (my own) image viewer finally gets zooming, panning, gesture handling, etc. This is table stakes for an image viewer, but oh well, it took me a long time to implement, because of multiple things: lack of time, and the JS library I was using for gestures was pretty old and unmaintained and caused more trouble than it was helping, etc. The feature is not perfect, and on the demo site there's already a bug: since all images are smaller than the screen (and this I didn't test), double-click to zoom doesn't work: it says "Already at minimum zoom", but zooming otherwise (+/- on the keyboard, mouse wheel, gestures) works.

End-to-end, the major development for this release was done over around two weeks, which is pretty short: I extensively used Claude Sonnet and Grok to unblock myself. Not to write code per se - although there is code written 1:1 by LLMs, most of it is weirdly wrong, and I have to either correct it or just use it as a starter and rewrite most of it - but discussing, unblocking, and learning about new things is what the current LLMs are very good at. And yet, sometimes even what they're good at fails hard. I asked for ideas to simplify a piece of code, and it went nowhere, even though there were significant rewrite possibilities. I spent the brain cycles on it, reverse engineered my own code, then simplified. I'll have to write a separate blog post on this.

In any case, this (zooming) was the last major feature I was missing. There are image viewer libraries, but most of them are slow compared to the bare-bones (well, now not so much anymore) viewer that I use as the main viewer. From now on, it will be minor incremental features, mostly around Exif management/handling, etc. Or, well, internal cleanups: extend test coverage, remove the use of jQuery in the frontend, etc.; there are tons of things to do.

Fun fact: I managed to discover a Safari iOS bug. Or at least I think it's a bug, so I reported it and am curious what'll come out of it. Finally, I still couldn't fix the GitHub Actions bug where git describe doesn't see the just-pushed tag, sigh, so the demo site still lists Corydalis v2024.12.0-133-g00edf63 as the version.

11 May 2025

Bits from Debian: Bits from the DPL

Dear Debian community, This is bits from the DPL for April.

End of 10

I am sure I was speaking in the interest of the whole project when joining the "End of 10" campaign. Here is what I wrote to the initiators:
Hi Joseph and all drivers of the "End of 10" campaign, On behalf of the entire Debian project, I would like to say that we proudly join your great campaign. We stand with you in promoting Free Software, defending users' freedoms, and protecting our planet by avoiding unnecessary hardware waste. Thank you for leading this important initiative.
Andreas Tille Debian Project Leader
I have some goals I would like to share with you for my second term.

Ftpmaster delegation

This splits up into tasks that can be done before and after the Trixie release.

Before Trixie:

1. Reducing Barriers to DFSG Compliance Checks

Back in 2002, Debian established a way to distribute cryptographic software in the main archive, whereas such software had previously been restricted to the non-US archive. One result of this arrangement, which influences our workflow, is that all packages uploaded to the NEW queue must remain on the server that hosts them. This requirement means that members of the ftpmaster team must log in to that specific machine, where they are limited to a restricted set of tools for reviewing uploaded code. This setup may act as a barrier to participation--particularly for contributors who might otherwise assist with reviewing packages for DFSG compliance. I believe it is time to reassess this limitation and work toward removing such hurdles.

In October last year, we had some initial contact with SPI's legal counsel, who noted that US regulations around cryptography have been relaxed somewhat in recent years (as of 2021). This suggests it may now be possible to revisit and potentially revise the conditions under which we manage cryptographic software in the NEW queue. I plan to investigate this further. If you have expertise in software or export control law and are interested in helping with this topic, please get in touch with me. The ultimate goal is to make it easier for more people to contribute to ensuring that code in the NEW queue complies with the DFSG.

2. Discussing Alternatives

My chances to reach out to other distributions remained limited. However, regarding the processing of new software, I learned that OpenSUSE uses a Git-based workflow that requires five "LGTM" approvals from a group of trusted developers. As far as I know, Fedora follows a similar approach. Inspired by this, a recent community initiative--the Gateway to NEW project--enables peer review of new packages for DFSG compliance before they enter the NEW queue. This effort allows anyone to contribute by reviewing packages and flagging potential issues in advance via Git. I particularly appreciate that the DFSG review is coupled with CI, allowing for both license and technical evaluation.

While this process currently results in some duplication of work--since final reviews are still performed by the ftpmaster team--it offers a valuable opportunity to catch issues early and improve the overall quality of uploads. If the community sees long-term value in this approach, it could serve as a basis for evolving our workflows. Integrating it more closely into DAK could streamline the process, and we've recently seen that merge requests reflecting community suggestions can be accepted promptly. For now, I would like to gather opinions about how such initiatives could best complement the current NEW processing, and whether greater consensus on trusted peer review could help reduce the burden on the team doing DFSG compliance checks. Submitting packages for review and automated testing before uploading can improve quality and encourage broader participation in safeguarding Debian's Free Software principles. My explicit thanks go out to the Gateway to NEW team for their valuable and forward-looking contribution to Debian.

3. Documenting Critical Workflows

Past ftpmaster trainees have told me that understanding the full set of ftpmaster workflows can be quite difficult. While there is some useful documentation--thanks in particular to Sean Whitton for his work on documenting NEW processing rules--many other important tasks carried out by the ftpmaster team remain undocumented or only partially so. Comprehensive and accessible documentation would greatly benefit current and future team members, especially those onboarding or assisting in specific workflows. It would also help ensure continuity and transparency in how critical parts of the archive are managed. If such documentation already exists and I have simply overlooked it, I would be happy to be corrected. Otherwise, I believe this is an area where we need to improve significantly. Volunteers with a talent for writing technical documentation are warmly invited to contact me--I'd be happy to help establish connections with ftpmaster team members who are willing to share their knowledge so that it can be written down and preserved.

Once Trixie is released (hopefully before DebConf):

4. Split of the Ftpmaster Team into DFSG and Archive Teams

As discussed during the "Meet the ftpteam" BoF at DebConf24, I would like to propose a structural refinement of the current Ftpmaster team by introducing two different delegated teams:
  1. DFSG Team
  2. Archive Team (responsible for DAK maintenance and process tooling, including releases)
(Alternative name suggestions are, of course, welcome.) The primary task of the DFSG team would be the processing of the NEW queue and ensuring that packages comply with the DFSG. The Archive team would focus on maintaining DAK and handling the technical aspects of archive management.

I am aware that, in the recent past, the ftpmaster team has decided not to actively seek new members. While I respect the autonomy of each team, the resulting lack of a recruitment pipeline has led to some friction and concern within the wider community, including myself. As Debian Project Leader, it is my responsibility to ensure the long-term sustainability and resilience of our project, which includes fostering an environment where new contributors can join and existing teams remain effective and well-supported. Therefore, even if the current team does not prioritize recruitment, I will actively seek and encourage new contributors for both teams, with the aim of supporting openness and collaboration.

This proposal is not intended as criticism of the current team's dedication or achievements--on the contrary, I am grateful for the hard work and commitment shown, often under challenging circumstances. My intention is to help address the structural issues that have made onboarding and specialization difficult and to ensure that both teams are well-supported for the future. I also believe that both teams should regularly inform the Debian community about the policies and procedures they apply. I welcome any suggestions for a more detailed description of the tasks involved, as well as feedback on how best to implement this change in a way that supports collaboration and transparency. My intention with this proposal is to foster a more open and effective working environment, and I am committed to working with all involved to ensure that any changes are made collaboratively and with respect for the important work already being done.

I'm aware that the ideas outlined above touch on core parts of how Debian operates and involve responsibilities across multiple teams. These are not small changes, and implementing them will require thoughtful discussion and collaboration. To move this forward, I've registered a dedicated BoF for DebConf. To make the most of that opportunity, I'm looking for volunteers who feel committed to improving our workflows and processes. With your help, we can prepare concrete and sensible proposals in advance--so the limited time of the BoF can be used effectively for decision-making and consensus-building. In short: I need your help to bring these changes to life. From my experience in my last term, I know that when it truly matters, the Debian community comes together--and I trust that spirit will guide us again.

Please also note: we had a "Call for volunteers" five years ago, and much of what was written there still holds true today. I've been told that the response back then was overwhelming--but that training such a large number of volunteers didn't scale well. This time, I hope we can find a more sustainable approach: training a few dedicated people first, and then enabling them to pass on their knowledge. This will also be a topic at the DebCamp sprint.

Dealing with Dormant Packages

Debian was founded on the principle that each piece of software should be maintained by someone with expertise in it--typically a single, responsible maintainer. This model formed the historical foundation of Debian's packaging system and helped establish high standards of quality and accountability. However, as the project has grown and the number of packages has expanded, this model no longer scales well in all areas. Team maintenance has since emerged as a practical complement, allowing multiple contributors to share responsibility and reduce bottlenecks--depending on each team's internal policy.

While working on the Bug of the Day initiative, I observed a significant number of packages that have not been updated in a long time. In the case of team-maintained packages, addressing this is often straightforward: team uploads can be made, or the team can be asked whether the package should be removed. We've also identified many packages that would fit well under the umbrella of active teams, such as language teams like Debian Perl and Debian Python, or blends like Debian Games and Debian Multimedia. Often, no one has taken action--not because of disagreement, but simply due to inattention or a lack of initiative. In addition, we've found several packages that probably should be removed entirely. In those cases, we've filed bugs with pre-removal warnings, which can later be escalated to removal requests.

When a package is still formally maintained by an individual, but shows signs of neglect (e.g., no uploads for years, unfixed RC bugs, failing autopkgtests), we currently have three main tools:
  1. The MIA process, which handles inactive or unreachable maintainers.
  2. Package Salvaging, which allows contributors to take over maintenance if conditions are met.
  3. Non-Maintainer Uploads (NMUs), which are limited to specific, well-defined fixes (which do not include things like migration to Salsa).
These mechanisms are important and valuable, but they don't always allow us to react swiftly or comprehensively enough. Our tools for identifying packages that are effectively unmaintained are relatively weak, and the thresholds for taking action are often high. The Package Salvage team is currently trialing a process we've provisionally called "Intend to NMU" (ITN). The name is admittedly questionable--some have suggested alternatives like "Intent to Orphan"--and discussion about this is ongoing on debian-devel. The mechanism is intended for situations where packages appear inactive but aren't yet formally orphaned, introducing a clear 21-day notice period before NMUs, similar in spirit to the existing ITS process.

The discussion has sparked suggestions for expanding NMU rules. While it is crucial not to undermine the autonomy of maintainers who remain actively involved, we also must not allow a strict interpretation of this autonomy to block needed improvements to obviously neglected packages. To be clear: I do not propose to change the rights of maintainers who are clearly active and invested in their packages. That model has served us well. However, we must also be honest that, in some cases, maintainers stop contributing--quietly and without transition plans. In those situations, we need more agile and scalable procedures to uphold Debian's high standards.

To that end, I've registered a BoF session for DebConf25 to discuss potential improvements in how we handle dormant packages. These discussions will be prepared during a sprint at DebCamp, where I hope to work with others on concrete ideas. Among the topics I want to revisit is my proposal from last November on debian-devel, titled "Barriers between packages and other people". While the thread prompted substantial discussion, it understandably didn't lead to consensus. I intend to ensure the various viewpoints are fairly summarised--ideally by someone with a more neutral stance than myself--and, if possible, work toward a formal proposal during the DebCamp sprint to present at the DebConf BoF. My hope is that we can agree on mechanisms that allow us to act more effectively in situations where formerly very active volunteers have, for whatever reason, moved on. That way, we can protect both Debian's quality and its collaborative spirit.

Building Sustainable Funding for Debian

Debian incurs ongoing expenses to support its infrastructure--particularly hardware maintenance and upgrades--as well as to fund in-person meetings like sprints and mini-DebConfs. These investments are essential to our continued success: they enable productive collaboration and ensure the robustness of the operating system we provide to users and derivative distributions around the world. While DebConf benefits from generous sponsorship, and we regularly receive donated hardware, there is still considerable room to grow our financial base--especially to support less visible but equally critical activities. One key goal is to establish a more constant and predictable stream of income, helping Debian plan ahead and respond more flexibly to emerging needs.

This presents an excellent opportunity for contributors who may not be involved in packaging or technical development. Many of us in Debian are engineers first--and fundraising is not something we've been trained to do. But just like technical work, building sustainable funding requires expertise and long-term engagement. If you're someone who's passionate about Free Software and has experience with fundraising, donor outreach, sponsorship acquisition, or nonprofit development strategy, we would deeply value your help. Supporting Debian doesn't have to mean writing code. Helping us build a steady and reliable financial foundation is just as important--and could make a lasting impact.

Kind regards
Andreas.

PS: In April I also planted my 5000th tree, and while this is off-topic here, I'm proud to share this information with my fellow Debian friends.

1 May 2025

Ian Jackson: Free Software, internal politics, and governance

There is a thread of opinion in some Free Software communities that we shouldn't be "doing politics", and instead should just focus on technology. But that's impossible. This approach is naive, harmful, and, ultimately, self-defeating, even on its own narrow terms.

Today I'm talking about small-p politics

In this article I'm using "politics" in the very wide sense: us humans managing our disagreements with each other. I'm not going to talk about culture wars, woke, racism, trans rights, and so on. I am not going to talk about how Free Software has always had explicitly political goals; or how it's impossible to be neutral because choosing not to take a stand is itself to take a stand. Those issues are all important and Free Software definitely must engage with them. Many of the points I make are applicable there too. But those are not my focus today. Today I'm talking in more general terms about politics, power, and governance.

Many people working together always entails politics

Computers are incredibly complicated nowadays. Making software is a joint enterprise. Even if an individual program has only a single maintainer, it fits into an ecosystem of other software, maintained by countless other developers. Larger projects can have thousands of maintainers and hundreds of thousands of contributors. Humans don't always agree about everything. This is natural. Indeed, it's healthy: to write the best code, we need a wide range of knowledge and experience. When we can't come to agreement, we need a way to deal with that: a way that lets us still make progress, but also leaves us able to work together afterwards. A way that feels OK for everyone. Providing a framework for disagreement is the job of a governance system. The rules say which people make which decisions, who must be consulted, how the decisions are made, and how, if at all, they can be reviewed. This is all politics.

Consensus is great but always requiring it is harmful

Ideally a discussion will converge to a synthesis that satisfies everyone, or at least a consensus. When consensus can't be achieved, we can hope for compromise: something everyone can live with. Compromise is achieved through negotiation. If every decision requires consensus, then the proponents of any wide-ranging improvement have an almost insurmountable hurdle: those who are favoured by the status quo and find it convenient can always object, so there will never be consensus for change. If there is any objection at all, no matter how ill-founded, the status quo will always win. This is where governance comes in.

Governance is like backups: we need to practice it

Governance processes are the backstop for when discussions, and then negotiations, fail, and people still don't see eye to eye. In a healthy community, everyone needs to know how the governance works and what the rules are. The participants need to accept the system's legitimacy. Everyone, including the losing side, must be prepared to accept and implement (or, at least, not obstruct) whatever the decision is, and hopefully live with it and stay around. That means we need to practice our governance processes. We can't just leave them for the day we have a huge and controversial decision to make. If we do that, then when it comes to the crunch we'll have toxic rows where no-one can agree the rules; where determined people bend the rules to fit their outcome; and where afterwards people feel like the whole thing was horrible and unfair.

So our decisionmaking bodies and roles need to be making decisions, as a matter of routine, and we need to get used to that. First-line decisionmaking bodies should be making decisions frequently. Last-line appeal mechanisms (large-scale votes, for example) are naturally going to be exercised more rarely, but they must happen, be seen as legitimate, and their outcomes must be implemented in full.

Governance should usually be routine and boring

When governance is working well it's quite boring. People offer their input, and are heard. Angles are debated, and concerns are addressed. If agreement still isn't reached, the committee, or elected leader, makes a decision. Hopefully everyone thinks the leadership is legitimate, and that it properly considered and heard their arguments, and made the decision for good reasons. Hopefully the losing side can still get their work done (and make their own computer work the way they want); so while they will be disappointed, they can live with the outcome. Many human institutions manage this most of the time. It does take some knowledge about principles of governance, and ideally some experience.

Governance means deciding, not just mediating

By "making decisions" I mean exercising their authority to rule on an actual disagreement: one that wasn't resolved by debate or negotiation. Governance processes by definition involve deciding, not just mediating. It's not governance if we're advising or cajoling: in that case, we're back to demanding consensus. Governance is necessary precisely when consensus is not achieved. If the governance systems are to mean anything, they must be able to (over)rule; that means (over)ruling must be normal and accepted. Otherwise, when we need to overrule, we'll find that we can't, because we lack the collective practice. To be legitimate (and seen as legitimate), decisions must usually be made based on the merits, not on participants' status, and not only on process questions.

On the autonomy of the programmer

Many programmers seem to find the very concept of governance, and binding decisionmaking, deeply uncomfortable. Ultimately, it means sometimes overruling someone's technical decision. As programmers and maintainers we naturally see how this erodes our autonomy. But we have all seen projects where the maintainers are unpleasant, obstinate, or destructive. We have all found this frustrating. Software is all interconnected, and one programmer's bad decisions can cause problems for many of the rest of us. We exasperate: "why won't they just do the right thing?". This is futile. People have never "just"ed and they're not going to start "just"ing now. So often the boot is on the other foot. More broadly, as software developers, we have a responsibility to our users, and a duty to write code that does good rather than ill in the world. We ought to be accountable. (And not just to capitalist bosses!) Governance mechanisms are the answer. (No, forking anything but the smallest project is very rarely a practical answer.)

Mitigate the consequences of decisions - retain flexibility

In software, it is often possible to soften the bad social effects of a controversial decision by retaining flexibility. With a bit of extra work, we can often provide hooks, non-default configuration options, or plugin arrangements. If we can convert the question from "how will the software always behave" into merely "what should the default be", we can often save ourselves a lot of drama. So it is often worth keeping even suboptimal or untidy features or options, if people want to use them and are willing to maintain them. There is a tradeoff here, of course. But Free Software projects often significantly under-value the social benefits of keeping everyone happy. Wrestling software - even crusty or buggy software - is a lot more fun than having unpleasant arguments.

But don't do decisionmaking like a corporation

Many programmers' experience of formal decisionmaking is from their boss at work. But corporations are often a very bad example. They typically don't have as much trouble actually making decisions, but the actual decisions are often terrible, and not just because corporations' goals are often bad. You get to be a decisionmaker in a corporation by spouting plausible nonsense, sounding confident, buttering up the even-more-vacuous people further up the chain, and sometimes by sabotaging your rivals. Corporate senior managers are hardly ever held accountable; typically the effects of their tenure are only properly felt well after they've left to mess up somewhere else. We should select our leaders more wisely, and base decisions on substance.

If you won't do politics, politics will do you

As a participant in a project, or a society, you can of course opt out of getting involved in politics. You can opt out of learning how to do politics generally, and opt out of understanding your project's governance structures. You can opt out of making judgements about disputed questions, and tell yourself "there's merit on both sides". You can hate politicians indiscriminately, and criticise anyone you see doing politics. If you do this, then you are abdicating your decisionmaking authority to those who are the most effective manipulators, or the most committed to getting their way. You're tacitly supporting the existing power bases. You're ceding power to the best liars, to those with the least scruples, and to the people who are most motivated by dominance. This is precisely the opposite of what you wanted. If enough people won't do politics, and hate anyone who does, your discussion spaces will be reduced to a battleground of only the hardiest and the most toxic.

If you don't see the politics, it's still happening

If your governance systems don't work, then there is no effective redress against bad or even malicious decisions. Your roleholders and subteams are unaccountable power centres. Power radically distorts every human relationship, and it takes great strength of character for an unaccountable power centre not to eventually become an unaccountable toxic cabal. So if you have a reasonable-sized community, but don't see your formal governance systems working - people debating things, votes, leadership making explicit decisions - that doesn't mean everything is fine, all the decisions are great, and there's no politics happening. It just means that most of your community have given up on the official process. It also probably means that some parts of your project have formed toxic and unaccountable cabals. Those who won't put up with that will leave. The same is true if the only governance actions that ever happen are massive drama. That means that only the most determined victim of a bad decision will even consider using such a process.

Conclusions


30 April 2025

Utkarsh Gupta: FOSS Activities in April 2025

Here's my 67th monthly but brief update about the activities I've done in the F/L/OSS world.

Debian
This was my 76th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas '19! \o/ There's a bunch of things I do, both technical and non-technical. Here's what I did:
  • Updating Matomo to v5.3.1.
  • Lots of bursary stuff for DC25. We rolled out the results for the first batch.
  • Helping Andreas Tille with and around FTP team bits.
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

Ubuntu
This was my 51st month of actively contributing to Ubuntu. I joined Canonical to work on Ubuntu full-time back in February 2021. Whilst I can't give a full, detailed list of things I did (there's so much and some of it might not be public yet!), here's a quick TL;DR of what I did:
  • Released 25.04 Plucky Puffin! \o/
  • Helped open the 25.10 Questing Quokka archive. Let the development begin!
  • Jon, VP of Engineering, asked me to lead the Canonical Release team - that was definitely not something I saw coming. :)
  • We're now doing Ubuntu monthly releases for the devel releases - I'll be the tech lead for the project.
  • Preparing for the May sprints - too many new things and new responsibilities. :)

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the stretch and jessie releases (+2 years after LTS support). This was my 67th month as a Debian LTS and 54th month as a Debian ELTS paid contributor.
Due to DC25 bursary work, Ubuntu 25.04 release, and other travel bits, I only worked for 2.00 hours for LTS and 4.50 hours for ELTS. I did the following things:
  • [ELTS] Had already backported patches for adminer for the following CVEs:
    • CVE-2023-45195: an SSRF attack.
    • CVE-2023-45196: a denial of service attack.
    • Salsa repository: https://salsa.debian.org/lts-team/packages/adminer.
    • As the same CVEs affect LTS, we decided to release for LTS first and then for ELTS, but since I had no hours for LTS, I decided to do a bit more testing for ELTS to make sure things don't regress in buster.
    • Will prepare LTS (and also s-p-u, sigh) updates this month and get back to ELTS thereafter.
  • [LTS] Started to prepare the LTS update for adminer for the same CVEs as for ELTS:
    • CVE-2023-45195: an SSRF attack.
    • CVE-2023-45196: a denial of service attack.
    • Haven't fully backported the patch yet, but this is what I intend to do this month (now that I have hours :D).
  • [LTS] Partially attended the LTS meeting on Jitsi. Summary here.
    • Partially because I was fighting SSO auth issues with Jitsi. Looks like there were some upstream issues/activity and it was resulting in gateway crashes but all good now.
    • I was following the running notes and keeping up with things as much as I could. :)

Until next time.
:wq for today.

17 April 2025

Petter Reinholdtsen: Gearing up OpenSnitch for a 1.6.8 release in Trixie

Sadly, the interactive application firewall OpenSnitch has in practice been unmaintained in Debian for a while. A few days ago I decided to do something about it, and today I am happy with the result. This package monitors network traffic going in and out of a Linux machine, and shows a popup dialog to the logged-in desktop user, asking to approve or deny any new connections. It has proved very valuable in discovering programs calling home, giving me more control over how information leaks out of my Linux machine. So far the new version is only available in Debian experimental, but I plan to upload it to unstable as soon as I know it is working on a few more machines, and make sure the new version makes it into the next stable release of Debian. The package freeze is approaching, and there is not a lot of time left. If you read this blog post, I hope you can be one of the testers.

The new version should be using eBPF on architectures where this is working (amd64 and arm64), and fall back to /proc/ probing where the opensnitch-ebpf-modules package is missing (so far only armhf; an unrelated bug blocks building on riscv64 and s390x). Using eBPF should provide more accurate attribution of the programs responsible for network traffic from short-lived processes, which sometimes were unavailable in /proc/ when OpenSnitch tried to probe for information. I have limited experience with the new version, having used it myself for a day or so. It is easily backportable to Debian 12 Bookworm without code changes; all it needs is a simple 'debuild', thanks to the optional build dependencies. Due to a misfeature of llc on armhf, there is no eBPF support available there. I have not investigated the details, nor reported any bug yet, but for some reason -march=bpf is an unknown option on this architecture, causing the build in the ebpf_prog subdirectory to fail.

The package is maintained under the umbrella of the Debian Go team, and you can meet the current maintainers on the #debian-golang and #opensnitch IRC channels on irc.debian.org. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

31 March 2025

Russ Allbery: Review: Ghostdrift

Review: Ghostdrift, by Suzanne Palmer
Series: Finder Chronicles #4
Publisher: DAW
Copyright: May 2024
ISBN: 0-7564-1888-7
Format: Kindle
Pages: 378
Ghostdrift is a science fiction adventure and the fourth (and possibly final) book of the Finder Chronicles. You should definitely read this series in order and not start here, even though the plot of this book would stand alone. Following The Scavenger Door, in which he made enemies even more dramatically than he had in the previous books, Fergus Ferguson has retired to the beach on Coralla to become a tea master and take care of his cat. It's a relaxing, idyllic life and a much-needed total reset. Also, he's bored. The arrival of his alien friend Qai, in some kind of trouble and searching for him, is a complex balance between relief and disappointment. Bas Belos is one of the most notorious pirates of the Barrens. He has someone he wants Fergus to find: his twin sister, who disappeared ten years ago. Fergus has an unmatched reputation for finding things, so Belos kidnapped Qai's partner to coerce her into finding Fergus. It's not an auspicious beginning to a relationship, and Qai was ready to fight once they got her partner back, but Belos makes Fergus an offer of payment that, startlingly, is enough for him to take the job mostly voluntarily. Ghostdrift feels a bit like a return to Finder. Fergus is once again alone among strangers, on an assignment that he's mostly not discussing with others, piecing together clues and navigating tricky social dynamics. I missed his friends, particularly Ignatio, and while there are a few moments with AI ships, they play less of a role. But Fergus is so very good at what he does, and Palmer is so very good at writing it. This continues to be competence porn at its best. Belos's crew thinks Fergus is a pirate recruited from a prison colony, and he quietly sets out to win their trust with a careful balance of self-deprecation and unflappable skill, helped considerably by the hidden gift he acquired in Finder. The character development is subtle, but this feels like a Fergus who understands friendship and other people at a deeper and more satisfying level than the Fergus we first met three books ago. Palmer has a real talent for supporting characters and Ghostdrift is no exception. Belos's crew are criminals and murderers, and Palmer does remind the reader of that occasionally, but they're also humans with complex goals and relationships. Belos has earned their loyalty by being loyal and competent in a rough world where those attributes are rare. The morality of this story reminds me of infiltrating a gang: the existence of the gang is not a good thing, and the things they do are often indefensible, but they are an understandable reaction to a corrupt social system. The cops (in this case, the Alliance) are nearly as bad, as we've learned over the past couple of books, and considerably more insufferable. Fergus balances the ethical complexity in a way that I found satisfyingly nuanced, while quietly insisting on his own moral lines. There is a deep science fiction plot here, possibly the most complex of the series so far. The disappearance of Belos's sister is the tip of an iceberg that leads to novel astrophysics, dangerous aliens, mysterious ruins, and an extended period on a remote and wreck-strewn planet. I groaned a bit when the characters ended up on the planet, since treks across primitive alien terrain with jury-rigged technology are one of my least favorite science fiction tropes, but I need not have worried. 
Palmer knows what she's doing; the pace of the plot does slow a bit at first, but it quickly picks up again, adding enough new setting and plot complications that I never had a chance to be bored by alien plants. It helps that we get another batch of excellent supporting characters for Fergus to observe and win over. This series is such great science fiction. Each book becomes my new favorite, and Ghostdrift is no exception. The skeleton of its plot is a satisfying science fiction mystery with multiple competing factions, hints of fascinating galactic politics, complicated technological puzzles, and a sense of wonder that reminds me of reading Larry Niven's Known Space series. But the characters are so much better and more memorable than classic SF; compared to Fergus, Niven's Louis Wu barely exists and is readily forgotten as soon as the story is over. Fergus starts as a quiet problem-solver, but so much character depth unfolds over the course of this series. The ending of this book was delightfully consistent with everything we've learned about Fergus, but also the sort of ending that it's hard to imagine the Fergus from Finder knowing how to want. Ghostdrift, like each of the books in this series, reaches a satisfying stand-alone conclusion, but there is no reason within the story for this to be the last of the series. The author's acknowledgments, however, say that this is the end. I admit to being disappointed, since I want to read more about Fergus and there are numerous loose ends that could be explored. More importantly, though, I hope Palmer will write more novels in any universe of her choosing so that I can buy and read them. This is fantastic stuff. This review comes too late for the Hugo nominating deadline, but I hope Palmer gets a Best Series nomination for the Finder Chronicles as a whole. She deserves it. Rating: 9 out of 10

30 March 2025

Utkarsh Gupta: FOSS Activities in March 2025

Here's my 66th monthly but brief update about the activities I've done in the F/L/OSS world.

Debian
This was my 75th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas '19! \o/ There's a bunch of things I do, both technical and non-technical. Here's what I did:
  • Updating Rails to v7.2.2.1 for Trixie.
  • Updating Redmine to v6.0.4 for Trixie.
  • Kickstarting the bursary team for DC25.
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

Ubuntu
This was my 50th month of actively contributing to Ubuntu. Now that I joined Canonical to work on Ubuntu full-time, there's a bunch of things I do! \o/ I mostly worked on different things, I guess. I was too lazy to maintain a list of things I worked on, so there's no concrete list atm. Maybe I'll get back to this section later or will start to list stuff from the fall, as I was doing before. :D

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the stretch and jessie releases (+2 years after LTS support). This was my 66th month as a Debian LTS and 53rd month as a Debian ELTS paid contributor.
I worked for 15.00 hours for LTS and 7.50 hours for ELTS. I did the following things:
  • [ELTS] Worked on backporting patches for adminer.
  • [E/LTS] Working on the musl fixes for bullseye. Taking it forward from where it was left off by Chris.
    • Co-ordinating with Santiago to see how to best get the reproducer to test the update.
    • Plan is to reproduce it myself but then reach out to Adrian if that doesn't work out.
    • Also makes sense to upload to LTS first, let it settle there, and then look at ELTS.
  • [LTS] Attended the LTS meeting on IRC. Summary here.
  • [stable] Co-ordinated with the Security team to fix rails in bookworm via 2:6.1.7.10+dfsg-1~deb12u1.
    • Fixes: CVE-2023-28362, CVE-2023-38037, CVE-2024-26144, CVE-2024-28103, CVE-2024-41128, CVE-2024-47887, CVE-2024-47888, CVE-2024-47889, and CVE-2024-54133.
    • Released as DSA 5881-1.
  • [stable] Co-ordinated with the Security team to fix ruby-rack in bookworm via 2.2.13-1~deb12u1.
    • Fixes: CVE-2025-27610, CVE-2025-27111, and CVE-2025-25184.
    • Released as DSA 5886-1.
  • [stable] Partly co-ordinated with the Security team to fix ruby-saml in bookworm.

Until next time.
:wq for today.

5 March 2025

Otto Kekäläinen: Will decentralized social media soon go mainstream?

In today's digital landscape, social media is more than just a communication tool: it is the primary medium for global discourse. Heads of state, corporate leaders and cultural influencers now broadcast their statements directly to the world, shaping public opinion in real time. However, the dominance of a few centralized platforms - X/Twitter, Facebook and YouTube - raises critical concerns about control, censorship and the monopolization of information. Those who control these networks effectively wield significant power over public discourse. In response, a new wave of distributed social media platforms has emerged, each built on different decentralized protocols designed to provide greater autonomy, censorship resistance and user control. While Wikipedia maintains a comprehensive list of distributed social networking software and protocols, it does not cover recent blockchain-based systems, nor does it highlight which have the most potential for mainstream adoption. This post explores the leading decentralized social media platforms and the protocols they are based on: Mastodon (ActivityPub), Bluesky (AT Protocol), Warpcast (Farcaster), Hey (Lens) and Primal (Nostr).

Comparison of architecture and mainstream adoption potential
Protocol | Identity System | Example | Storage model | Cost for end users | Potential
Mastodon | Tied to server domain | @ottok@mastodon.social | Federated instances | Free (some instances charge) | High
Bluesky | Portable (DID) | ottoke.bsky.social | Federated instances | Free | Moderate
Farcaster | ENS (Ethereum) | @ottok | Blockchain + off-chain | Small gas fees | Moderate
Lens | NFT-based (Polygon) | @ottok | Blockchain + off-chain | Small gas fees | Niche
Nostr | Cryptographic keys | npub16lc6uhqpg6dnqajylkhwuh3j7ynhcnje508tt4v6703w9kjlv9vqzz4z7f | Federated instances | Free (some instances charge) | Niche

1. Mastodon (ActivityPub)

Mastodon was created in 2016 by Eugen Rochko, a German software developer who sought to provide a decentralized and user-controlled alternative to Twitter. It was built on the ActivityPub protocol, now standardized by the W3C Social Web Working Group, to allow users to join independent servers while still communicating across the broader Mastodon network. Mastodon operates on a federated model, where multiple independently run servers communicate via ActivityPub. Each server sets its own moderation policies, leading to a decentralized but fragmented experience. The servers can alternatively be called instances, relays or nodes, depending on what vocabulary a protocol has standardized on.
  • Identity: User identity is tied to the instance where they registered, represented as @username@instance.tld.
  • Storage: Data is stored on individual instances, which federate messages to other instances based on their configurations.
  • Cost: Free to use, but relies on instance operators willing to run the servers.
The protocol defines multiple activities such as:
  • Creating a post
  • Liking
  • Sharing
  • Following
  • Commenting

Example Message in ActivityPub (JSON-LD Format)
json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Create",
  "actor": "https://mastodon.social/users/ottok",
  "object": {
    "type": "Note",
    "content": "Hello from #Mastodon!",
    "published": "2025-03-03T12:00:00Z",
    "to": ["https://www.w3.org/ns/activitystreams#Public"]
  }
}
Servers communicate across different platforms by publishing activities to their followers or forwarding activities between servers. Standard HTTPS is used between servers for communication, and the messages use JSON-LD for data representation. The WebFinger protocol is used for user discovery. There is, however, no neat way for home server discovery yet. This means that if you are browsing e.g. Fosstodon and want to follow a user and press Follow, a dialog will pop up asking you to enter your own home server (e.g. mastodon.social) to redirect you there for actually executing the Follow action with your account.

Mastodon is open source under the AGPL at github.com/mastodon/mastodon. Anyone can operate their own instance. It just requires running your own server, some skills to maintain a Ruby on Rails app with a PostgreSQL database backend, and a basic understanding of the protocol to configure federation with other ActivityPub instances.
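To make the discovery step above concrete, here is a minimal sketch in Python of a WebFinger lookup that resolves a handle to its ActivityPub actor URL. It assumes only the standard /.well-known/webfinger endpoint (RFC 7033) and the requests library; the handle is the example one from the comparison table.

python
# Minimal sketch: resolving a Fediverse handle to an ActivityPub actor URL
# via WebFinger (RFC 7033). Assumes the instance exposes the standard
# /.well-known/webfinger endpoint; the handle below is illustrative.
import requests

def resolve_actor(handle: str) -> str:
    """Resolve '@user@instance.tld' to the actor's ActivityPub profile URL."""
    user, _, domain = handle.lstrip("@").partition("@")
    resp = requests.get(
        f"https://{domain}/.well-known/webfinger",
        params={"resource": f"acct:{user}@{domain}"},
        timeout=10,
    )
    resp.raise_for_status()
    # The 'self' link with the ActivityPub media type points at the actor object.
    for link in resp.json().get("links", []):
        if link.get("rel") == "self" and "activity+json" in link.get("type", ""):
            return link["href"]
    raise ValueError("No ActivityPub actor link found")

print(resolve_actor("@ottok@mastodon.social"))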

Popularity: Already established, but will it grow more?

Mastodon has seen steady growth, especially after Twitter's acquisition in 2022, with some estimates stating it peaked at 10 million users across thousands of instances. However, its fragmented user experience and the complexity of choosing instances have hindered mainstream adoption. Still, it remains the most established decentralized alternative to Twitter. Note that Donald Trump's Truth Social is based on the Mastodon software but does not federate with the ActivityPub network. The ActivityPub protocol is the most widely used of its kind. One of the other most popular services is the Lemmy link sharing service, similar to Reddit. The larger ecosystem of ActivityPub is called the Fediverse, and estimates put the total active user count around 6 million.

2. Bluesky (AT Protocol)

Interestingly, Bluesky was conceived within Twitter in 2019 by Twitter founder Jack Dorsey. After being incubated as a Twitter-funded project, it spun off as an independent Public Benefit LLC in February 2022 and launched its public beta in February 2023. Bluesky runs on top of the Authenticated Transfer (AT) Protocol, published at https://github.com/bluesky-social/atproto. The protocol enables portable identities and data ownership, meaning users can migrate between platforms while keeping their identity and content intact. In practice, however, there is only one popular server at the moment, which is Bluesky itself.
  • Identity: Usernames are domain-based (e.g., @user.bsky.social).
  • Storage: Content is theoretically federated among various servers.
  • Cost: Free to use, but relies on instance operators willing to run the servers.

Example Message in AT Protocol (JSON Format)
json
{
  "repo": "did:plc:ottoke.bsky.social",
  "collection": "app.bsky.feed.post",
  "record": {
    "$type": "app.bsky.feed.post",
    "text": "Hello from Bluesky!",
    "createdAt": "2025-03-03T12:00:00Z",
    "langs": ["en"]
  }
}

Popularity: Hybrid approach may have business benefits?

Bluesky reported over 3 million users by 2024, probably getting traction due to its Twitter-like interface and Jack Dorsey's involvement. Its hybrid approach - decentralized identity with centralized components - could make it a strong candidate for mainstream adoption, assuming it can scale effectively.

3. Warpcast (Farcaster Network)

Farcaster was launched in 2021 by Dan Romero and Varun Srinivasan, both former executives of the crypto exchange Coinbase, to create a decentralized but user-friendly social network. Built on the Ethereum blockchain, it could potentially offer a very attack-resistant communication medium. However, in my own testing, Farcaster does not seem to fully leverage what Ethereum could offer. First of all, there is no diversity in programs implementing the protocol, as at the moment there is only Warpcast. In Warpcast the signup requires an initial 5 USD fee that is not payable in ETH, and users need to create a new wallet address on the Ethereum layer 2 network Base instead of simply reusing their existing Ethereum wallet address or ENS name.

Despite this, I can understand why Farcaster may have decided to start out like this. Having a single client program may be the best strategy initially. Matthew Hodgson, one of the founders of the decentralized chat protocol Matrix, shared in his FOSDEM 2025 talk that he slightly regrets focusing too much on developing the protocol instead of making sure the app to use it is attractive to end users. So it may be sensible to ensure Warpcast gets popular first, before attempting to make the Farcaster protocol widely used. As a protocol, Farcaster's hybrid approach makes it more scalable than fully on-chain networks, giving it a higher chance of mainstream adoption if it integrates seamlessly with broader Web3 ecosystems.
  • Identity: ENS (Ethereum Name Service) domains are used as usernames.
  • Storage: Messages are stored in off-chain hubs, while identity is on-chain.
  • Cost: Users must pay gas fees for some operations but reading and posting messages is mostly free.

Example Message in Farcaster (JSON Format)
json
{
  "fid": 766579,
  "username": "ottok",
  "custodyAddress": "0x127853e48be3870172baa4215d63b6d815d18f21",
  "connectedWallet": "0x3ebe43aa3ae5b891ca1577d9c49563c0cee8da88",
  "text": "Hello from Farcaster!",
  "publishedAt": 1709424000,
  "replyTo": null,
  "embeds": []
}

Popularity: Decentralized social media + decentralized payments - a winning combo?

Ethereum founder Vitalik Buterin (warpcast.com/vbuterin) and many core developers are active on the platform. Warpcast, the main client for Farcaster, has seen increasing adoption, especially among Ethereum developers and Web3 enthusiasts. I too have a profile at warpcast.com/ottok. However, the numbers are still very low and far from reaching the network effects needed to really take off. Blockchain-based social media networks, particularly those built on Ethereum, are compelling because they leverage existing user wallets and persistent identities while enabling native payment functionality. When combined with decentralized content funding through micropayments, these blockchain-backed social networks could offer unique advantages that centralized platforms may find difficult to replicate, being decentralized both as a technical network and in their funding mechanism.

4. Hey.xyz (Lens Network)

The Lens Protocol was developed by the decentralized finance (DeFi) team Aave and launched in May 2022 to provide a user-owned social media network. While initially built on Polygon, it has since launched its own Layer 2 network, called the Lens Network, in February 2024. Lens is currently the main competitor to Farcaster. Lens stores profile ownership and references on-chain, while content is stored on IPFS/Arweave, enabling composability with DeFi and NFTs.
  • Identity: Profile ownership is tied to NFTs on the Polygon blockchain.
  • Storage: Content references are on-chain, with the content itself stored on IPFS/Arweave (like NFTs).
  • Cost: Users must pay gas fees for some operations but reading and posting messages is mostly free.

Example Message in Lens (JSON Format)
json
{
  "profileId": "@ottok",
  "contentURI": "ar://QmExampleHash",
  "collectModule": "0x23b9467334bEb345aAa6fd1545538F3d54436e96",
  "referenceModule": "0x0000000000000000000000000000000000000000",
  "timestamp": 1709558400
}

Popularity: Probably not as a social media site, but maybe as a protocol?

The social media side of Lens is mainly the Hey.xyz website, which seems to have fewer users than Warpcast, and is even further away from reaching critical mass for network effects. The Lens protocol, however, has a lot of advanced features and may gain adoption as a building block for many Web3 apps.

5. Primal.net (Nostr Network)

Nostr (Notes and Other Stuff Transmitted by Relays) was conceptualized in 2020 by an anonymous developer known as fiatjaf. One of the primary design tenets was to be a censorship-resistant protocol, and it is popular among Bitcoin enthusiasts, with Jack Dorsey being one of its public supporters. Unlike the Farcaster and Lens protocols, Nostr is not blockchain-based but just a network of relay servers for message distribution. It does however use public key cryptography for identities, similar to how wallets work in crypto.
  • Identity: Public-private key pairs define identity (with prefix npub...).
  • Storage: Content is federated among multiple servers, which in Nostr vocabulary are called relays.
  • Cost: No gas fees, but relies on relay operators willing to run the servers.

Example Message in Nostr (JSON Format)
json
{
  "id": "note1xyz...",
  "pubkey": "npub1...",
  "kind": 1,
  "content": "Hello from Nostr!",
  "created_at": 1709558400,
  "tags": [],
  "sig": "sig1..."
}

Popularity: If Jack Dorsey and Bitcoiners promote it enough?

Primal.net as a web app is pretty solid, but it does not stand out much. While Jack Dorsey has shown support by donating $1.5 million to the protocol development in December 2021, its success likely depends on broader adoption by the Bitcoin community.

Will any of these replace X/Twitter?

As usage patterns vary, the statistics are not fully comparable, but this snapshot of the situation in March 2025 gives a decent overview.
Platform | Total Accounts | Active Users | Growth Trend
Mastodon | ~10 million | ~1 million | Steady
Bluesky | ~33 million | ~1 million | Steady
Nostr | ~41 million | ~20 thousand | Steady
Farcaster | ~850 thousand | ~50 thousand | Flat
Lens | ~140 thousand | ~20 thousand | Flat
Mastodon and Bluesky have already reached millions of users, while Lens and Farcaster are growing within crypto communities. It is however clear that none of these are anywhere close to how popular X/Twitter is. In particular, Mastodon had a huge influx of users in the fall of 2022 when Twitter was acquired, but to challenge the incumbents the growth would need to accelerate significantly. We can all accelerate this development by embracing decentralized social media now, alongside the existing dominant platforms. Who knows, given the right circumstances, maybe X.com leadership decides to change the operating model and start federating contents to break out from the walled garden model. The likelihood of such a development would increase if decentralized networks get popular, and the incumbents feel they need to participate to not lose out.

Past and future

The idea of decentralized social media is not new. One early pioneer, identi.ca, launched in 2008, only two years after Twitter, using the OStatus protocol to promote decentralization. A few years later it evolved into pump.io with the ActivityPump protocol, and also forked into GNU Social, which continued with OStatus. I remember when these happened, and in 2010 Diaspora also launched with fairly large publicity. Surprisingly, both of these still operate (I can still post on both identi.ca and diasp.org), but the activity fizzled out years ago. The protocol however survived partially and evolved into ActivityPub, which is now the backbone of the Fediverse.

The evolution of decentralized social media over the next decade will likely parallel developments in democracy, freedom of speech and public discourse. While the early 2010s emphasized maximum independence and freedom, the late 2010s saw growing support for content moderation to combat misinformation. The AI era introduces new challenges, potentially requiring proof-of-humanity verification for content authenticity. Key factors that will determine success:
  • User experience and ease of onboarding
  • Network effects and critical mass of users
  • Integration with existing web3 infrastructure
  • Balance between decentralization and usability
  • Sustainable economic models for infrastructure
This is clearly an area of development worth monitoring closely, as the next few years may determine which protocol becomes the de facto standard for decentralized social communication.

2 February 2025

Bits from Debian: Bits from the DPL

Dear Debian community, this is bits from the DPL for January.

Sovereign Tech Agency: I was recently pointed to Technologies and Projects supported by the Sovereign Tech Agency, which is financed by the German Federal Ministry for Economic Affairs and Climate Action. It is a subsidiary of the Federal Agency for Disruptive Innovation, SPRIND GmbH. It is worth sending applications there for distinct projects, as that is their preferred method of funding. Distinguished developers can also apply for a fellowship position that pays up to 40hrs/week (32hrs when freelancing) for a year. This is especially open to maintainers of larger numbers of packages in Debian (or any other Linux distribution). There might be a chance that some of the Debian-related projects submitted to the Google Summer of Code that did not get funded could be retried with those foundations. As per the FAQ of the project: "The Sovereign Tech Agency focuses on securing and strengthening open and foundational digital technologies. The communities working on these are distributed all around the world, so we work with people, companies, and FOSS communities everywhere." Similar funding organizations include the Open Technology Fund and FLOSS/fund. If you have a Debian-related project that fits these funding programs, they might be interesting options. This list is by no means exhaustive; it is just some hints I've received and wanted to share. More suggestions for such opportunities are welcome.

Year of code reviews: On the debian-devel mailing list, there was a long thread titled "Let's make 2025 a year when code reviews became common in Debian". It initially suggested something along the lines of: "Let's review MRs in Salsa." The discussion quickly expanded to include patches that have been sitting in the BTS for years, which deserve at least the same attention. One idea I'd like to emphasize is that associating BTS bugs with MRs could be very convenient. It's not only helpful for documentation but also the easiest way to apply patches. I'd like to emphasize that no matter what workflow we use (BTS, MRs, or a mix), it is crucial to uphold Debian's reputation for high quality. However, this reputation is at risk as more and more old issues accumulate. While Debian is known for its technical excellence, long-standing bugs and orphaned packages remain a challenge. If we don't address these, we risk weakening the high standards that Debian is valued for. Revisiting old issues and ensuring that unmaintained packages receive attention is especially important as we prepare for the Trixie release.

Debian Publicity Team will no longer post on X/Twitter: The Press Team has my full support in its decision to stop posting on X. As per the Publicity delegation, the team once decided to join Twitter, but circumstances have since changed. The current Press delegates have the institutional authority to leave X, just as their predecessors had the authority to join. I appreciate that the team carefully considered the matter, reinforced by the arguments developed on the debian-publicity list, and communicated its reasoning openly.

Kind regards, Andreas.

Colin Watson: Free software activity in January 2025

Most of my Debian contributions this month were sponsored by Freexian. If you appreciate this sort of work and are at a company that uses Debian, have a look to see whether you can pay for any of Freexian's services; as well as the direct benefits, that revenue stream helps to keep Debian development sustainable for me and several other lovely people. You can also support my work directly via Liberapay.

Python team: We finally made Python 3.13 the default version in testing! I fixed various bugs that got in the way of this. As with last month, I fixed a few more build regressions due to the removal of a deprecated intersphinx_mapping syntax in Sphinx 8.0. I ported a few packages to Django 5.1, and ported python-pypump to IPython 8.0. I fixed python-datamodel-code-generator to handle isort 6, and contributed that upstream. I fixed some packages to tolerate future versions of dh-python that will drop their dependency on python3-setuptools. I removed the old python-celery-common transitional package from celery, since nothing in Debian needs it any more. I fixed or helped to fix various other build/test failures, and upgraded several packages to new upstream versions.

Rust team: I fixed rust-pyo3-ffi to avoid explicit Python version dependencies that were getting in the way of making Python 3.13 the default version.

Security tools packaging team: I uploaded libevt to fix a build failure on i386 and to tolerate future versions of dh-python that will drop their dependency on python3-setuptools.

Installer team: I helped with some testing of a debian-installer-utils patch as part of the /usr move. I need to get around to uploading this, since it looks OK now.

Other small things: Helmut Grohne reached out for help debugging a multi-arch coinstallability problem (you know it's going to be complicated when even Helmut can't figure it out on his own) in binutils, and we had a call about that. I reviewed and applied a new Romanian translation of debconf's manual pages. I did my twice-yearly refresh of debmirror's mirror_size documentation, and applied a contribution to improve the example debmirror.conf. I fixed an arguable preprocessor string handling bug in man-db, and applied a fix for out-of-tree builds.
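As a side note on the intersphinx_mapping change mentioned above: Sphinx 8.0 removed the long-deprecated un-named form of the mapping, so affected packages needed their docs/conf.py updated along these lines (a minimal sketch; the project name and URL are illustrative, not taken from any particular package):

# docs/conf.py (illustrative)

# Deprecated form, removed in Sphinx 8.0: bare URI keys with no name.
# intersphinx_mapping = {
#     "https://docs.python.org/3": None,
# }

# Current form: a name mapped to (target URI, inventory), where None
# means "fetch objects.inv from the target URI".
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
}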

29 January 2025

Sergio Talens-Oliag: Testing DeepSeek with Ollama and Open WebUI

With all the recent buzz about DeepSeek and its capabilities, I've decided to give it a try using Ollama and Open WebUI on my work laptop, which has an NVIDIA GPU:
$ lspci | grep NVIDIA
0000:01:00.0 3D controller: NVIDIA Corporation GA107GLM [RTX A2000 8GB Laptop GPU]
             (rev a1)
For the installation I initially looked into the approach suggested in this article, but after reviewing it I decided to go for a Docker-only approach, as it leaves my system clean and makes updates easier.

Step 0: Install docker. I already had it on my machine, so nothing to do here.

Step 1: Install the nvidia-container-toolkit package. As it is needed to use the NVIDIA GPU with docker, I followed the instructions to install the package using apt from the NVIDIA website.

Step 2: Run the Open WebUI container bundled with Ollama. I could have installed ollama directly on Linux or run it in Docker, but I found out that there is a container with Open WebUI bundled with Ollama, so I decided to use that instead. To start the container I executed the following command:
docker run -d \
  -e OLLAMA_HOST="0.0.0.0:11434" -p 127.0.0.1:11434:11434 \
  -p 127.0.0.1:3000:8080 \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --gpus=all --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:ollama
Once the container is up and running, I can access the Open WebUI at http://localhost:3000/ and the Ollama API at http://localhost:11434/ (I added the OLLAMA_HOST environment variable to bind the server to all the container IPs, as that is needed to export it to the host).

Step 3: Install the DeepSeek-R1 model from Open WebUI. After creating an admin user on the Open WebUI, I went to the select a model button and asked it to download the deepseek-r1 model (at the time of writing this matches version deepseek-r1:7b, which uses 4.7GB of space). After the download completed I was able to start chatting with the model on my local machine.
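Once the model is downloaded, it can also be queried directly through the Ollama API that the container exposes on port 11434, bypassing the web UI. A minimal sketch using only the Python standard library (the model tag matches the one downloaded above; /api/generate is Ollama's one-shot completion endpoint):

import json
import urllib.request

# Ask the local Ollama server for a single completion.
# "stream": False returns one JSON object instead of a stream of chunks.
payload = {
    "model": "deepseek-r1:7b",
    "prompt": "Can you tell me what is Debian?",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    answer = json.load(resp)
print(answer["response"])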

Step 4: Ask a question. The first question I asked was "Can you tell me what is Debian?" and the answer I got was:
Debian is one of the most popular Linux distributions used worldwide. It is an
open-source operating system that has been around since 1993 and is known for
its stability, security, and reliability. Debian is widely used in servers, data
centers, and enterprise environments because it provides a robust base platform
with excellent documentation, tools, and community support.
### Key Features of Debian:

1. **History**:
   - Debian started as a project called "Pop!!" (Pootle) led by Lars Knutsen, a
     developer from Denmark.
   - It was initially forked from the Linux distribution Red Hat's RH-Debian to
     gain independence and reduce reliance on Red Hat.

2. **Use Cases**:
   - Debian is commonly used as an operating system for servers due to its
     reliability and ease of use in enterprise environments.
   - It is also used for individual user machines (laptops, desktops) because it
     offers a smooth and stable experience.
   - Debian provides the foundation for many popular derivatives like Ubuntu,
     which are designed for consumer use.

3. **Supported Architectures**:
   - Debian supports all major Linux architectures, including 32-bit and 64-bit
     systems, and runs on a wide range of hardware from old servers to modern
     smartphones.

4. **Package Management**:
   - Debian comes with a robust package management system called **Debian Policy
     Language (DPL)**, which allows users to easily install, update, and manage
     software packages.
   - This makes it easy for users to keep their systems up to date without
     needing to download multiple repositories or deal with version conflicts.

5. **Security Focus**:
   - Debian is known for its strong security practices. It has a reputation for
     being one of the most stable Linux distributions, making it a favorite
     among organizations that prioritize security.
   - The distribution regularly updates its software packages to patch
     vulnerabilities and protect against common attacks.

6. **Compatibility with Servers**:
   - Debian is widely used in data centers because it provides a consistent and
     reliable environment for running servers and applications.
   - Many open-source projects are compatible with Debian, making it an ideal
     choice for developers who want to work on servers or distributed systems.

7. **Popularity**:
   - Debian has a large and active community of users and contributors.
   - It is one of the most widely used Linux distributions in the world, with
     many enterprise organizations relying on it as their primary operating
     system.
### Comparison with Other Distributions:
- **Ubuntu**: Ubuntu is based on Debian but adds graphical user interfaces (like
  Gutsy Gibbon), modern tools (like GIMP and LibreOffice), and integrates with
  Microsoft services.
- **RHEL (Red Hat Enterprise Linux)**: RHEL is a derivative of Red Hat's
  distribution, which forked from Debian. It focuses more heavily on enterprise
  security and support compared to Debian.
### Why Debian?
Debian is popular because:
- It is free and open-source, making it accessible to anyone.
- It provides excellent documentation and tools for system administrators.
- It has a strong focus on stability and reliability.
- It supports a wide range of hardware and software needs.
If you're looking for a reliable and flexible Linux distribution that works well
in both personal and enterprise environments, Debian is an excellent choice.
I haven't tried the same question with ChatGPT; the History, Package Management and Comparison with Other Distributions sections are hilarious, but they provided me with important data:
  • Now I know that APT is the real Debian Project Leader; what I don't know is why we have elections each year ;)
  • Linux distributions are recursive: Debian was a fork of Red Hat, which was a fork of Debian, which was a fork of Red Hat, ...
As everybody is testing the model I will not talk more about the chat and the results; I just thought that this answer was really funny.

Step 5: Install the DeepSeek Coder and DeepSeek Coder v2 models from Open WebUI. As done before, to download the models I went to the select a model button and asked it to download the deepseek-coder and deepseek-coder-v2 models (the default version of the former is said to be really quick and small, while the latter is supposed to be better but slower and bigger, so I decided to install both for testing).

Step 6: Integrate Ollama with Neovim. For some months now I've been using GitHub Copilot with Neovim; I don't feel it has been very helpful in the general case, but I wanted to try it, and it comes in handy when you need to perform repetitive tasks while programming. It seems that there are multiple Neovim plugins that support ollama; for now I've installed and configured the codecompanion plugin in my config.lua file using packer:
require('packer').startup(function()
  [...]
  -- Codecompanion plugin
  use {
    "olimorris/codecompanion.nvim",
    requires = {
      "nvim-lua/plenary.nvim",
      "nvim-treesitter/nvim-treesitter",
    },
  }
  [...]
end)
[...]
-- --------------------------------
-- BEG: Codecompanion configuration
-- --------------------------------
-- Module setup
local codecompanion = require('codecompanion').setup({
  adapters = {
    ollama = function()
      return require('codecompanion.adapters').extend('ollama', {
        schema = {
          model = {
            default = 'deepseek-coder-v2:latest',
          },
        },
      })
    end,
  },
  strategies = {
    chat = { adapter = 'ollama', },
    inline = { adapter = 'ollama', },
  },
})
-- --------------------------------
-- END: Codecompanion configuration
-- --------------------------------
I've tested it a little and it seems to work fine, but I'll have to use it more to see whether it is really useful; I'll try to do that on future projects.

Conclusion. At a personal level I don't like nor trust AI systems, but as long as they are treated as tools, and not as a magical thing you must trust, they have their uses, and I'm happy to see open source tools like Ollama and models like DeepSeek available for everyone to use.

5 January 2025

Jonathan McDowell: Free Software Activities for 2024

I tailed off on blog posts towards the end of the year; I blame a bunch of travel (personal + business), catching the flu, then December being its usual busy self. Anyway, to try and start off the year a bit better I thought I'd do my annual recap of my Free Software activities. For previous years see 2019, 2020, 2021, 2022 + 2023.

Conferences In 2024 I managed to make it to FOSDEM again. It's a hectic conference, and I know there are legitimate concerns about it being a super spreader event, but it has the advantage of being relatively close and having a lot of different groups of people I want to talk to / see talk at it. I'm already booked to go this year as well. I spoke at All Systems Go in Berlin about Using TPMs at scale for protecting keys. It was nice to actually be able to talk publicly about some of the work stuff my team and I have been working on. I'd a talk submission in for FOSDEM about our use of attestation and why it's not necessarily the evil some folk claim, but there were a lot of good talks submitted and I wasn't selected. Maybe I'll find somewhere else suitable to do it. BSides Belfast may or may not count - it's a security conference, but there's a lot of overlap with various bits of Free software, so I feel it deserves a mention. I skipped DebConf for 2024 for a variety of reasons, but I'm expecting to make DebConf25 in Brest, France in July.

Debian Most of my contributions to Free software continue to happen within Debian. In 2023 I'd done a bunch of work on retrogaming with Kodi on Debian, so I made an effort to try and keep those bits more up to date, even if I'm not actually regularly using them at present. RetroArch got 1.18.0+dfsg-1 and 1.19.1+dfsg-1 uploads. libretro-core-info got associated 1.18.0-1 and 1.19.0-1 uploads too. I note 1.20.0 has been released recently, so I'll have to find some time to build the appropriate DFSG tarball and update it. rcheevos saw 11.2.0-1, 11.5.0-1 + 11.6.0-1 uploaded. kodi-game-libretro itself had 20.2.7-1 uploaded, then 21.0.7-1. Latest upstream is 22.1.0, but that's tracking Kodi 22 and we're still on Kodi 21, so I plan to follow the Omega branch for now. Which, I've just noticed, had a 21.0.8 release this week. Finally in the games space I uploaded mgba 0.10.3+dfsg-1 and 0.10.3+dfsg-2 for Ryan Tandy, before realising he was already a Debian Maintainer and granting him the appropriate ACL access so he can upload it himself; I've had zero concerns about any of his packaging. The Debian Electronics Packaging Team continues to be home for a bunch of packages I care about. There was nothing big there, for me, in 2024, but a few bits of cleanup here and there. I seem to have become one of the main uploaders for sdcc - I have some interest in the space, and the sigrok firmware requires it to build, so I at least like to ensure it's in half decent state. I uploaded 4.4.0+dfsg-1, 4.4.0+dfsg-2, and, just in time to count for 2024, 4.4.0+dfsg-3. The sdcc 4.4 upload led to some compilation issues for sigrok-firmware-fx2lafw, so I uploaded 0.1.7-2 fixing that, then 0.1.7-3 doing some further cleanups. OpenOCD had 0.12.0-2 uploaded to disable the libgpiod backend thanks to incompatible changes upstream. There were some in-discussion patches with OpenOCD upstream at the time, but they didn't seem to be ready yet so I held off on pulling them in. 0.12.0-3 fixed builds with more recent versions of jimtcl. It looks like the next upstream release is about a year away, so Trixie will in all probability ship with 0.12.0 as well. libjaylink had a new upstream release, so 0.4.0-1 was uploaded. libserialport also had a new upstream release, leading to 0.1.2-1. I finally cracked and uploaded sg3-utils 1.48-1 into experimental. I'm not the primary maintainer, but 1.46 is nearly 4 years old now and I wanted to get it updated in enough time to shake out any problems before we get to a Trixie freeze. Outside of team owned packages, libcli had compilation issues with GCC 14, leading to 1.10.7-2. I also added a new package, sedutil 1.20.0-2, back in April; it looks fairly unmaintained upstream (there's been some recent activity, but it doesn't seem to be release quality), but there was an outstanding ITP and I've some familiarity with the space as we've been using it at work as part of investigating TCG OPAL encryption. I continue to keep an eye on Debian New Members, even though I'm mostly inactive as an application manager - we generally seem to have enough available recently. Mostly my involvement is via Front Desk activities, helping out with queries to the team alias, and contributing to internal discussions. Finally the 3 month rotation for Debian Keyring continues to operate smoothly. I dealt with 2023.03.24, 2023.06.24, 2023.09.22 + 2023.11.24.

Linux I'd a single kernel contribution this year, to Clean up TPM space after command failure. That was based on some issues we saw at work. I've another fix in progress that I hope to submit in 2025, but it's for an intermittent failure, so confirming the fix is necessary + sufficient is taking a little while.

Personal projects I didn't end up doing much in the way of externally published personal project work in 2024. Despite the release of OpenPGP v6 in RFC 9580, I did not manage to really work on onak. I started on the v6 support, but have not had sufficient time to complete anything worth pushing externally yet. listadmin3 got some minor updates based on external feedback / MRs. It's nice to know it's useful to other folk even in its basic state. That wraps up 2024. I've got no particular goals for this year at present. Ideally I'd get v6 support into onak, and it would be nice to implement some of the wishlist items people have provided for listadmin3, but I'll settle for making sure all my Debian packages are in reasonable state for Trixie.

12 November 2024

Paul Tagliamonte: Complex for Whom?

In basically every engineering organization I've ever regarded as particularly high functioning, I've sat through one specific recurring conversation which is not a conversation about "complexity". Things are good or bad because they are or aren't complex, architectures need to be redone because they're too complex, some refactor of whatever it is won't work because it's too complex. You may have even been a part of some of these conversations, or even been the one advocating for simple light-weight solutions. I've done it. Many times. Rarely, if ever, do we talk about complexity within its rightful context: complexity for whom. Is a solution complex because it's complex for the end user? Is it complex if it's complex for an API consumer? Is it complex if it's complex for the person maintaining the API service? Is it complex if it's complex for someone outside the team maintaining it to understand? Complexity within a problem domain, I've come to believe, is fairly zero-sum: there's a fixed amount of complexity in the problem to be solved, and you can choose to either solve it, or leave it for those downstream of you to solve on their own. That being said, while I believe there is a lower bound in complexity to contend with for a problem, I do not believe there is an upper bound to the complexity of solutions possible. It is always possible, and in fact very likely, that teams create problems for themselves while trying to solve a problem. The rest of this post is talking to the lower bound. When getting feedback on an early draft of this blog post, I was informed that Fred Brooks coined a term for what I call lower bound complexity: "Essential Complexity", in the paper "No Silver Bullet: Essence and Accident in Software Engineering", which is a better term and can be used interchangeably.

Complexity Culture In a large enough organization, where the team is high functioning enough to have and maintain trust amongst peers, members of the team will specialize. People will begin to engage with subsets of the work to be done, and begin to have their efficacy measured against that part of the organization's problems. Incentives shift, and over time it becomes increasingly likely that two engineers may have two very different priorities when working on the same system together. Someone accountable for uptime and tasked with responding to outages will begin to resist changes. Someone accountable for rapidly delivering features will resist gates between them and their users. Companies (either wittingly or unwittingly) will deal with this by tasking engineers with both production (feature development) and operational tasks (maintenance), so the difference in incentives isn't usually as bad as it could be. When we get a bunch of folks from far-flung corners of an organization in a room, fire up a slide deck and throw up some aspirational to-be architecture diagram in order to get a sign-off to solve some problem (be it someone needs a credible promotion packet, a new feature needs to get delivered, or the system has begun to fail and needs fixing), the initial reaction will, more often than I'd like, start to devolve into a discussion of how this is going to introduce a bunch of complexity, going to be hard to maintain, why can't you make it less complex? Right around here is when I start to try and contextualize the conversation happening around me: understand what complexity is being discussed, and understand who is taking on that burden. Think about who should be owning that problem, and work through the tradeoffs involved. Is it best solved here, or left to consumers (be them other systems, developers, or users)? Should something become an API call's optional param, taking on all the edge-cases, or should users have to implement the logic using the data you return (leaving everyone else to take on all the edge-cases and maintenance)? Should you process the data, or require the user to preprocess it for you? Frequently it's right to make an active and explicit decision to simplify and leave problems to be solved downstream, since they may not actually need to be solved, or perhaps you expect consumers will want to own the specifics of how the problem is solved, in which case you leave lots of documentation and examples. Many other times, especially when it's something downstream consumers are likely to hit, it's best solved internal to the system, since the only thing that can come of leaving it unsolved are bugs, frustration and half-correct solutions. This is a grey-space of tradeoffs, not a clear decision tree. No one wants the software manifestation of a katamari ball or a junk drawer, nor does anyone want a half-baked service unable to handle the simplest use-case.

Head-in-sand as a Service Popoffs about how complex something is are, to a first approximation, best understood as meaning "complicated for the person making the comments". A lot of the #thoughtleadership believe that an AWS hosted EKS k8s cluster running images built by CI talking to an AWS hosted PostgreSQL RDS is not complex. They're right. Mostly right. This is less complex: less complex for them. It's not, however, without complexity and its own tradeoffs; it's just complexity that they do not have to deal with. Now they don't have to maintain machines that have pesky operating systems or hard drive failures. They don't have to deal with updating the version of k8s, nor ensuring the backups work. No one has to push some artifact to prod manually. Deployments happen unattended. You click a button and get a cluster. On the other hand, developers outside the ops function need to deal with troubleshooting CI, debugging access control rules encoded in turing complete YAML, permissions issues inside the cluster due to whatever the fuck a service mesh is, everyone needs to learn how to use some k8s tools they only actually use during a bad day, likely while doing some x.509 troubleshooting to connect to the cluster (an internal only endpoint; just port forward it!), not to mention all sorts of rules to route packets to their project (a single repo's binary being run in 3 containers on a single vm host). Beyond that, there's the invisible complexity: complexity on the interior of a service you depend on. I think about the dozens of teams maintaining the EKS service (which is either run on EC2 instances, or alternately, EC2 instances in a trench coat, moustache and even more shell scripts), the RDS service (also EC2 and shell scripts, but this time accounting for redundancy, backups, availability zones), scores of hypervisors pulled off the shelf (xen, kvm) smashed together with the ones built in-house (firecracker, nitro, etc) running on hardware that has to be refreshed and maintained continuously. Every request is processed by network ACL rules, AWS IAM rules, security group rules, using IP space announced to the internet wired through IXPs directly into ISPs. I don't even want to begin to think about the complexity inherent in how those switches are designed. Shitloads of complexity to solve problems you may or may not have, or even know you had. What's more complex? An app running in an in-house 4u server racked in the office's telco closet in the back running off the office Verizon line, or an app running four hypervisors deep in an AWS datacenter? Which is more complex to you? What about to your organization? In total? Which is more prone to failure? Which is more secure? Is the complexity good or bad? What type of complexity can you manage effectively? Which threatens the system? Which threatens your users?

COMPLEXIVIBES This extends beyond engineering. Decisions regarding what tools we are able to use (be them existing contracts with cloud providers, CIO mandated SaaS products, or a list of the only permissible open source projects) will incur costs in terms of expressed "complexity". Pinning open source projects to a fixed set makes SBOM production "less complex". Using only one SaaS provider's product suite (even if it's terrible, because it has all the types of tools you need) makes accreditation "less complex". If all you have is a contract with Pauly T's lowest price technically acceptable artisanal cloudary and haberdashery, the way you pay for your compute is "less complex" for the CIO shop, though you will find yourself building your own hosted database template, mechanism to spin up a k8s cluster, and all the operational and technical burden that comes with it. Or you won't, and will make it everyone else's problem in the organization. Nothing you can do will solve for the fact that you must now deal with this problem somewhere, because it was less complicated for the business to put the workloads on the existing contract with a cut-rate vendor. Suddenly, the decision to reduce complexity because of an existing contract vehicle has resulted in a huge amount of technical risk and maintenance burden being onboarded. Complexity you would otherwise externalize has now been taken on internally. With a large enough organization (specifically, in this case, I'm talking about you, bureaucracies), this is largely ignored or accepted as normal, since the personnel cost is understood to be free to everyone involved. Doing it this way is more expensive, more work, less reliable and less maintainable, and yet, somehow, is, in a lot of ways, less complex to the organization. It's particularly bad with bureaucracies, since screwing up a contract will get you into much more trouble than delivering a broken product, leaving basically no reason for anyone to care to fix this. I can't shake the feeling that for every story of technical mandates gone awry, somewhere just out of sight there's a decisionmaker optimizing for what they believe to be the least amount of complexity (least hassle, fewest unique cases, most consistency) as they can. They freely offload complexity from their accreditation and risk acceptance functions through mandates. They will never have to deal with it. That does not change the fact that someone does.

TC;DR (TOO COMPLEX; DIDN'T REVIEW) We wish to rid ourselves of systemic complexity; after all, complexity is bad, simplicity is good. Removing upper-bound own-goal complexity ("accidental complexity" in Brooks's terms) is important, but once you hit the lower bound complexity, the tradeoffs become zero-sum. Removing complexity from one part of the system means that somewhere else (maybe outside your organization, or in a non-engineering function) it must grow back. Sometimes, the opposite is the case, such as when a previously manual business process is automated. Maybe that's a good idea. Maybe it's not. All I know is that what doesn't help the situation is conflating complexity with everything we don't like: legacy code, maintenance burden or toil, cost, delivery velocity.
  • Complexity is not the same as proclivity to failure. The most reliable systems I've interacted with are unimaginably complex, with layers of internal protection to prevent complete failure. This has its own set of costs which other people have written about extensively.
  • Complexity is not cost. Sometimes the cost of taking all the complexity in-house is less, for whatever value of cost you choose to use.
  • Complexity is not absolute. Something simple from one perspective may be wildly complex from another. The impulse to burn down complex sections of code is helpful to have generally, but sometimes things are complicated for a reason, even if that reason exists outside your codebase or organization.
  • Complexity is not something you can remove without introducing complexity elsewhere. Just as not making a decision is a decision itself, choosing to require someone else to deal with a problem rather than dealing with it internally is a choice that needs to be considered in its full context.
Next time you're sitting through a discussion and someone starts to talk about all the complexity about to be introduced, I want to pop up in the back of your head, politely asking: what does complex mean in this context? Is it lower bound complexity? Is this complexity desirable? Does what they're saying mean something along the lines of "I don't understand the problems being solved", or does it mean something along the lines of "this problem should be solved elsewhere"? Do they believe this will result in more work for them in a way that you don't see? Should this not be solved at all, by changing the bounds of what we should accept, or by redefining the understood limits of this system? Is the perceived complexity a result of a decision elsewhere? Who's taking this complexity on? Or, more to the point, is failing to address complexity required by the problem leaving it to others? Does it impact others? How specifically? What are you not seeing? What can change? What should change?

21 October 2024

Sahil Dhiman: Free Software Mirrors in India

Last updated on 15/11/2024. A list of public mirrors in India. Locations discovered based on personal knowledge, traces or GeoIP. Mirrors which aren't accessible outside their own ASN are excluded.

North India

East India

South India

West India

CDN (or behind one) Many thanks to Shrirang and Saswata for tips and corrections. Let me know if I'm missing someone or something is amiss.

15 October 2024

Iustin Pop: Optical media lifetime - one data point

Way back (more than 10 years ago), when I was doing DVD-based backups, I knew that normal DVDs/Blu-rays are no long-term archival solution, and that if I was serious about doing optical media backups, I needed to switch to M-Disc. I actually bought a (small) stack of M-Disc Blu-rays, but never used them. I then switched to other backup solutions, and forgot about the whole topic. Until, this week, while sorting stuff, I happened upon a set of DVD backups from a range of years, and was very curious whether they are still readable after so many years. And, to my surprise, there were no surprises! I went backward in time, and: I also found a stack of dual-layer DVD+Rs from 2012-2014, some for sure Verbatim, and some unmarked (they were intended to be printed on), but likely Verbatim as well. All worked just fine. Just that, even at ~8GiB per disk, backing up raw photo files took way too many disks, even in 2014. At this point I was happy that all 12+ DVDs I found, ranging from 10 to 14 years old, are all good. Then I found a batch of 3 CDs! Here the results were mixed. I think the takeaway is that all the explicitly selected media - TDK, JVC and Verbatim - hold for 10-20 years. Valid reads from summer 2003 are mind-boggling to me, for (IIRC) organic media - not sure about the TDK metallic substrate. And when you just pick whatever ("Creation"), well, the results are mixed. Note that all this was about CDs and DVDs. I have no idea how Blu-rays behave, since I don't think I ever wrote a Blu-ray. In any case, this was surprising to me, and makes me rethink my backup options a bit. Sizes from 25 to 100GB for Blu-rays are reasonable for most critical data. And they're WORM, as opposed to most LTO media, which is re-writable (and to some small extent, prone to accidental wiping). Now, I should check those M-Discs to see if they can still be written to, after 10 years...

26 September 2024

Melissa Wen: Reflections on 2024 Linux Display Next Hackfest

Hey everyone! The 2024 Linux Display Next hackfest concluded in May, and its outcomes continue to shape the Linux Display stack. Igalia hosted this year's event in A Coruña, Spain, bringing together leading experts in the field. Samuel Iglesias and I organized this year's edition, and this blog post summarizes the experience and its fruits. One of the highlights of this year's hackfest was the wide range of backgrounds represented by our 40 participants (both on-site and remote). Developers and experts from various companies and open-source projects came together to advance the Linux Display ecosystem. You can find the list of participants here. The event covered a broad spectrum of topics affecting the development of Linux projects, user experiences, and the future of display technologies on Linux. From cutting-edge topics to long-term discussions, you can check the event agenda here.

Organization Highlights The hackfest was marked by in-depth discussions and knowledge sharing among Linux contributors, making everyone inspired, informed, and connected to the community. Building on feedback from the previous year, we refined the unconference format to enhance participant preparation and engagement. Structured Agenda and Timeboxes: Each session had a defined scope, time limit (1h20 or 2h10), and began with an introductory talk on the topic.
  • Participant-Led Discussions: We pre-selected in-person participants to lead discussions, allowing them to prepare introductions, resources, and scope.
  • Transparent Scheduling: The schedule was shared in advance as GitHub issues, encouraging participants to review and prepare for sessions of interest.
Engaging Sessions: The hackfest featured a variety of topics, including presentations and discussions on how participants were addressing specific subjects within their companies.
  • No Breakout Rooms, No Overlaps: All participants chose to attend all sessions, eliminating the need for separate breakout rooms. We also adapted the run-time schedule to keep everybody involved in the same topics.
  • Real-time Updates: We provided notifications and updates through dedicated emails and the event matrix room.
Strengthening Community Connections: The hackfest offered ample opportunities for networking among attendees.
  • Social Events: Igalia sponsored coffee breaks, lunches, and a dinner at a local restaurant.
  • Museum Visit: Participants enjoyed a sponsored visit to the Museum of Estrela Galicia Beer (MEGA).

Fruitful Discussions and Follow-up The structured agenda and breaks allowed us to cover multiple topics during the hackfest. These discussions have led to new display feature development and improvements, as evidenced by patches, merge requests, and implementations in project repositories and mailing lists. With the KMS color management API taking shape, we discussed refinements and the best approaches to cover the variety of color pipelines from different hardware vendors. We are also investigating techniques for performant SDR<->HDR content reproduction and for reducing latency and power consumption when using the color blocks of the hardware.

Color Management/HDR Color Management and HDR continued to be the hottest topic of the hackfest. We had three sessions dedicated to discussing Color and HDR across the Linux Display stack layers.

Color/HDR (Kernel-Level) Harry Wentland (AMD) led this session. Here, kernel developers shared the color management pipelines of AMD, Intel and NVidia. We had diagrams and explanations from HW vendors' developers that discussed differences, constraints and paths to fit them into the KMS generic color management properties, such as advertising modeset needs, IN_FORMAT, segmented LUTs, interpolation types, etc. Developers from Qualcomm and ARM also added information regarding their hardware. Upstream work related to this session:

Color/HDR (Compositor-Level) Sebastian Wick (RedHat) led this session. It started with Sebastian's presentation covering Wayland color protocols and compositor implementation, along with an explanation of the APIs provided by Wayland and how they can be used to achieve better color management for applications, plus discussions around ICC profiles and color representation metadata. There was also an intensive Q&A about LittleCMS with Marti Maria. Upstream work related to this session:

Color/HDR (Use Cases and Testing) Christopher Cameron (Google) and Melissa Wen (Igalia) led this session. In contrast to the other sessions, here we focused less on implementation and more on brainstorming and reflections about real-world SDR and HDR transformations (use and validation) and gainmaps. Christopher gave a nice presentation explaining HDR gainmap images and how we should think of HDR. This presentation and Q&A were important to put participants on the same page about how to transition between SDR and HDR, and how to somehow emulate HDR. We also discussed the usage of a kernel background color property. Finally, we discussed a bit about Chamelium and the future of VKMS (future work and maintainership).

Power Savings vs Color/Latency Mario Limonciello (AMD) led this session. Mario gave an introductory presentation about AMD ABM (adaptive backlight management), which is similar to Intel DPST. After some discussions, we agreed on exposing a kernel property for power saving policy. This work has already been merged in the kernel, and the userspace support is under development. Upstream work related to this session:

Strategy for video and gaming use-cases Leo Li (AMD) led this session. Miguel Casas (Google) started this session with a presentation of Overlays in Chrome/OS Video, explaining the main goal of saving power by switching off the GPU for accelerated compositing, and the challenges of different colorspaces/HDR for video on Linux. Then Leo Li presented different strategies for video and gaming, and we discussed the userspace need for more detailed feedback mechanisms to understand failures when offloading. Also, creating a debugFS interface came up as a tool for debugging and analysis.

Real-time scheduling and async KMS API Xaver Hugl (KDE/BlueSystems) led this session. Compositor developers have exposed some issues with doing real-time scheduling and async page flips. One is that the kernel limits the lifetime of realtime threads, and if a modeset takes too long, the thread will be killed, and thus the compositor as well. Also, simple page flips take longer than expected and drivers should optimize them. Another issue is the lack of feedback to compositors about hardware programming time and commit deadlines (the latest possible time to commit). This is difficult to predict from drivers, since it varies greatly with the type of properties. For example, color management updates take much longer. In this regard, we discussed implementing a hw_done callback to timestamp when the hardware programming of the last atomic commit is complete, as well as an API to pre-program the color pipeline in a kind of A/B scheme. It may not be supported by all drivers, but might be useful in different ways.

VRR/Frame Limit, Display Mux, Display Control, and more... and beer We also had sessions to discuss a new KMS API to mitigate headaches with VRR and Frame Limit, such as different brightness levels at different refresh rates, abrupt changes of refresh rates, low frame rate compensation (LFC), precise timing in VRR, and more. On Display Control we discussed features missing in the current KMS interface for HDR mode, atomic backlight settings, source-based tone mapping, etc. We also discussed the need for a place where compositor developers can post TODOs to be developed by KMS people. The Content-adaptive Scaling and Sharpening session focused on sharpening and scaling filters. In the Display Mux session, we discussed proposals to expose the capability of dynamically mux-switching the display signal between discrete and integrated GPUs. In the last session of the 2024 Display Next Hackfest, participants representing different compositors summarized current and future work and built a Linux Display "wish list", which includes: improvements to VTTY and HDR switching, a better dmabuf API for multi-GPU support, definition of tone mapping, blending and scaling semantics, and Wayland protocols for advertising to clients which colorspaces are supported. We closed this session with a status update on feature development by compositors, including but not limited to: plane offloading (from libcamera to output) / HDR video offloading (dma-heaps) / plane-based scrolling for web pages, color management / HDR / ICC profiles support, and addressing issues such as flickering when color primaries don't match, etc. After three days of intensive discussions, all in-person participants went on a guided tour of the Museum of Estrela Galicia beer (MEGA), pouring and tasting the most famous local beer.

Feedback and Future Directions Participants provided valuable feedback on the hackfest, including suggestions for future improvements.
  • Schedule and Break-time Setup: Having a pre-defined agenda and schedule provided a better balance between long discussions and mental refreshments, preventing the fatigue caused by endless discussions.
  • Action Points: Some participants recommended explicitly asking for action points at the end of each session and assigning people to follow-up tasks.
  • Remote Participation: Remote attendees appreciated the inclusive setup and opportunities to actively participate in discussions.
  • Technical Challenges: There were bandwidth and video streaming issues during some sessions due to the large number of participants.

Thank you for joining the 2024 Display Next Hackfest We can't help but thank the 40 participants, who engaged in person or virtually in relevant discussions, for a collaborative evolution of the Linux display stack and for building an insightful agenda. A big thank you to the leaders and presenters of the nine sessions: Christopher Cameron (Google), Harry Wentland (AMD), Leo Li (AMD), Mario Limonciello (AMD), Sebastian Wick (RedHat) and Xaver Hugl (KDE/BlueSystems) for the effort in preparing the sessions, explaining the topics and guiding discussions. My acknowledgment to the other in-person participants who made such an effort to travel to A Coruña: Alex Goins (NVIDIA), David Turner (Raspberry Pi), Georges Stavracas (Igalia), Joan Torres (SUSE), Liviu Dudau (Arm), Louis Chauvet (Bootlin), Robert Mader (Collabora), Tian Mengge (GravityXR), Victor Jaquez (Igalia) and Victoria Brekenfeld (System76). It was an awesome opportunity to meet you and chat face-to-face. Finally, thanks to the virtual participants who couldn't make it in person but organized their days to actively participate in each discussion, adding different perspectives and valuable inputs even remotely: Abhinav Kumar (Qualcomm), Chaitanya Borah (Intel), Christopher Braga (Qualcomm), Dor Askayo (Red Hat), Jiri Koten (RedHat), Jonas Ådahl (Red Hat), Leandro Ribeiro (Collabora), Marti Maria (Little CMS), Marijn Suijten, Mario Kleiner, Martin Stransky (Red Hat), Michel Dänzer (Red Hat), Miguel Casas-Sanchez (Google), Mitulkumar Golani (Intel), Naveen Kumar (Intel), Niels De Graef (Red Hat), Pekka Paalanen (Collabora), Pichika Uday Kiran (AMD), Shashank Sharma (AMD), Sriharsha PV (AMD), Simon Ser, Uma Shankar (Intel) and Vikas Korjani (AMD). We look forward to another successful Display Next hackfest, continuing to drive innovation and improvement in the Linux display ecosystem!


8 September 2024

Jacob Adams: Linux's Bedtime Routine

How does Linux move from an awake machine to a hibernating one? How does it then manage to restore all state? These questions led me to read way too much C in trying to figure out how this particular hardware/software boundary is navigated. This investigation will be split into a few parts, with the first one going from invocation of hibernation to synchronizing all filesystems to disk. This article has been written using Linux version 6.9.9, the source of which can be found in many places, but can be navigated easily through the Bootlin Elixir Cross-Referencer: https://elixir.bootlin.com/linux/v6.9.9/source Each code snippet will begin with a link to the above giving the file path and the line number of the beginning of the snippet.

A Starting Point for Investigation: /sys/power/state and /sys/power/disk These two sysfs files exist to allow debugging of hibernation, and thus to control the exact state used directly. Writing specific values to the state file controls the sleep mode used, while disk controls the specific hibernation mode1. This is extremely handy as an entry point for understanding how these systems work, since we can just follow what happens when they are written to.
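As a quick orientation before diving into the kernel side, here is a minimal userspace sketch (my own illustration, not from the kernel sources) of what triggering hibernation through this interface looks like; it simply writes the string disk to /sys/power/state and must be run as root:

#include <stdio.h>

int main(void)
{
	/* Writing "disk" here lands in state_store(), which calls hibernate(). */
	FILE *f = fopen("/sys/power/state", "w");

	if (!f) {
		perror("fopen /sys/power/state");
		return 1;
	}
	if (fputs("disk", f) == EOF)
		perror("write disk");
	if (fclose(f) == EOF)	/* the actual write is flushed here */
		perror("fclose");
	return 0;
}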

Show and Store Functions These two files are defined using the power_attr macro: kernel/power/power.h:80
#define power_attr(_name) \
static struct kobj_attribute _name##_attr = {	\
	.attr	= {				\
		.name = __stringify(_name),	\
		.mode = 0644,			\
	},					\
	.show	= _name##_show,			\
	.store	= _name##_store,		\
}
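For example, power_attr(state) expands (reconstructed by hand from the macro above, for illustration) to roughly the following, wiring the sysfs file state to the state_show and state_store functions we examine next:

static struct kobj_attribute state_attr = {
	.attr	= {
		.name = "state",
		.mode = 0644,
	},
	.show	= state_show,
	.store	= state_store,
};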
show is called on reads and store on writes. state_show is a little boring for our purposes, as it just prints all the available sleep states. kernel/power/main.c:657
/*
 * state - control system sleep states.
 *
 * show() returns available sleep state labels, which may be "mem", "standby",
 * "freeze" and "disk" (hibernation).
 * See Documentation/admin-guide/pm/sleep-states.rst for a description of
 * what they mean.
 *
 * store() accepts one of those strings, translates it into the proper
 * enumerated value, and initiates a suspend transition.
 */
static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *attr,
			  char *buf)
{
	char *s = buf;
#ifdef CONFIG_SUSPEND
	suspend_state_t i;

	for (i = PM_SUSPEND_MIN; i < PM_SUSPEND_MAX; i++)
		if (pm_states[i])
			s += sprintf(s,"%s ", pm_states[i]);
#endif
	if (hibernation_available())
		s += sprintf(s, "disk ");
	if (s != buf)
		/* convert the last space to a newline */
		*(s-1) = '\n';
	return (s - buf);
}
state_store, however, provides our entry point: if the string disk is written to the state file, it calls hibernate(). kernel/power/main.c:715
static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
			   const char *buf, size_t n)
{
	suspend_state_t state;
	int error;

	error = pm_autosleep_lock();
	if (error)
		return error;

	if (pm_autosleep_state() > PM_SUSPEND_ON) {
		error = -EBUSY;
		goto out;
	}

	state = decode_state(buf, n);
	if (state < PM_SUSPEND_MAX) {
		if (state == PM_SUSPEND_MEM)
			state = mem_sleep_current;

		error = pm_suspend(state);
	} else if (state == PM_SUSPEND_MAX) {
		error = hibernate();
	} else {
		error = -EINVAL;
	}

 out:
	pm_autosleep_unlock();
	return error ? error : n;
}
kernel/power/main.c:688
static suspend_state_t decode_state(const char *buf, size_t n)
{
#ifdef CONFIG_SUSPEND
	suspend_state_t state;
#endif
	char *p;
	int len;

	p = memchr(buf, '\n', n);
	len = p ? p - buf : n;

	/* Check hibernation first. */
	if (len == 4 && str_has_prefix(buf, "disk"))
		return PM_SUSPEND_MAX;

#ifdef CONFIG_SUSPEND
	for (state = PM_SUSPEND_MIN; state < PM_SUSPEND_MAX; state++) {
		const char *label = pm_states[state];

		if (label && len == strlen(label) && !strncmp(buf, label, len))
			return state;
	}
#endif

	return PM_SUSPEND_ON;
}
Could we have figured this out just via function names? Sure, but this way we know for sure that nothing else is happening before this function is called.

Autosleep Our first detour is into the autosleep system. When checking the state above, you may notice that the kernel grabs the pm_autosleep_lock before checking the current state. autosleep is a mechanism originally from Android that sends the entire system to either suspend or hibernate whenever it is not actively working on anything. This is not enabled for most desktop configurations, since it's primarily for mobile systems and inverts the standard suspend and hibernate interactions. This system is implemented as a workqueue2 that checks the current number of wakeup events, processes and drivers that need to run3, and if there aren't any, then the system is put into the autosleep state, typically suspend. However, it could be hibernate if configured that way via /sys/power/autosleep in a similar manner to using /sys/power/state to manually enable hibernation. kernel/power/main.c:841
static ssize_t autosleep_store(struct kobject *kobj,
			       struct kobj_attribute *attr,
			       const char *buf, size_t n)
{
	suspend_state_t state = decode_state(buf, n);
	int error;

	if (state == PM_SUSPEND_ON
	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
		return -EINVAL;

	if (state == PM_SUSPEND_MEM)
		state = mem_sleep_current;

	error = pm_autosleep_set_state(state);
	return error ? error : n;
}
power_attr(autosleep);
#endif /* CONFIG_PM_AUTOSLEEP */
kernel/power/autosleep.c:24
static DEFINE_MUTEX(autosleep_lock);
static struct wakeup_source *autosleep_ws;
static void try_to_suspend(struct work_struct *work)
{
	unsigned int initial_count, final_count;

	if (!pm_get_wakeup_count(&initial_count, true))
		goto out;

	mutex_lock(&autosleep_lock);

	if (!pm_save_wakeup_count(initial_count) ||
		system_state != SYSTEM_RUNNING) {
		mutex_unlock(&autosleep_lock);
		goto out;
	}

	if (autosleep_state == PM_SUSPEND_ON) {
		mutex_unlock(&autosleep_lock);
		return;
	}
	if (autosleep_state >= PM_SUSPEND_MAX)
		hibernate();
	else
		pm_suspend(autosleep_state);

	mutex_unlock(&autosleep_lock);

	if (!pm_get_wakeup_count(&final_count, false))
		goto out;

	/*
	 * If the wakeup occurred for an unknown reason, wait to prevent the
	 * system from trying to suspend and waking up in a tight loop.
	 */
	if (final_count == initial_count)
		schedule_timeout_uninterruptible(HZ / 2);

 out:
	queue_up_suspend_work();
}

static DECLARE_WORK(suspend_work, try_to_suspend);

void queue_up_suspend_work(void)
{
	if (autosleep_state > PM_SUSPEND_ON)
		queue_work(autosleep_wq, &suspend_work);
}
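To make the autosleep mechanism concrete, here is a small userspace sketch of mine (analogous to the /sys/power/state example earlier) that would enable autosleep by writing a sleep state to /sys/power/autosleep, assuming the kernel was built with CONFIG_PM_AUTOSLEEP:

#include <stdio.h>

int main(void)
{
	/* autosleep_store() decodes this just as state_store() does; "mem"
	 * requests suspend-to-RAM whenever nothing holds a wakeup source,
	 * and writing "off" disables autosleep again. */
	FILE *f = fopen("/sys/power/autosleep", "w");

	if (!f) {
		perror("fopen /sys/power/autosleep");
		return 1;
	}
	if (fputs("mem", f) == EOF)
		perror("write mem");
	fclose(f);
	return 0;
}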

The Steps of Hibernation

Hibernation Kernel Config It's important to note that most of the hibernate-specific functions below do nothing unless you've defined CONFIG_HIBERNATION in your Kconfig4. As an example, hibernate itself is defined as the following if CONFIG_HIBERNATION is not set. include/linux/suspend.h:407
static inline int hibernate(void) { return -ENOSYS; }

Check if Hibernation is Available We begin by confirming that we actually can perform hibernation, via the hibernation_available function. kernel/power/hibernate.c:742
if (!hibernation_available()) {
	pm_pr_dbg("Hibernation not available.\n");
	return -EPERM;
}
kernel/power/hibernate.c:92
bool hibernation_available(void)
{
	return nohibernate == 0 &&
		!security_locked_down(LOCKDOWN_HIBERNATION) &&
		!secretmem_active() && !cxl_mem_active();
}
nohibernate is controlled by the kernel command line; it's set via either nohibernate or hibernate=no. security_locked_down is a hook for Linux Security Modules to prevent hibernation. This is used to prevent hibernating to an unencrypted storage device, as specified in the manual page kernel_lockdown(7). Interestingly, either level of lockdown, integrity or confidentiality, locks down hibernation because with the ability to hibernate you can extract basically anything from memory and even reboot into a modified kernel image. secretmem_active checks whether there is any active use of memfd_secret, and if so it prevents hibernation. memfd_secret returns a file descriptor that can be mapped into a process but is specifically unmapped from the kernel's memory space. Hibernating with memory that not even the kernel is supposed to access would expose that memory to whoever could access the hibernation image. This particular feature of secret memory was apparently controversial, though not as controversial as performance concerns around fragmentation when unmapping kernel memory (which did not end up being a real problem). cxl_mem_active just checks whether any CXL memory is active. A full explanation is provided in the commit introducing this check, but there's also a shortened explanation from cxl_mem_probe that sets the relevant flag when initializing a CXL memory device. drivers/cxl/mem.c:186
/*
 * The kernel may be operating out of CXL memory on this device,
 * there is no spec defined way to determine whether this device
 * preserves contents over suspend, and there is no simple way
 * to arrange for the suspend image to avoid CXL memory which
 * would setup a circular dependency between PCI resume and save
 * state restoration.
 */
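To see the secretmem check in action, here is a hedged userspace sketch of mine (requires Linux 5.14 or later with secretmem enabled, and headers that define SYS_memfd_secret): while the mapping below exists, hibernation_available() returns false, so writing disk to /sys/power/state fails with EPERM.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	int fd = syscall(SYS_memfd_secret, 0);

	if (fd < 0) {
		perror("memfd_secret");
		return 1;
	}
	if (ftruncate(fd, 4096) < 0) {
		perror("ftruncate");
		return 1;
	}
	char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	/* These pages are removed from the kernel's direct map, so they
	 * cannot safely be written into a hibernation image. */
	strcpy(p, "secret");
	munmap(p, 4096);
	close(fd);
	return 0;
}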

Check Compression The next check is for whether compression support is enabled, and if so whether the requested algorithm is enabled. kernel/power/hibernate.c:747
/*
 * Query for the compression algorithm support if compression is enabled.
 */
if (!nocompress) {
	strscpy(hib_comp_algo, hibernate_compressor, sizeof(hib_comp_algo));
	if (crypto_has_comp(hib_comp_algo, 0, 0) != 1) {
		pr_err("%s compression is not available\n", hib_comp_algo);
		return -EOPNOTSUPP;
	}
}
The nocompress flag is set via the hibernate command line parameter, setting hibernate=nocompress. If compression is enabled, then hibernate_compressor is copied to hib_comp_algo. This synchronizes the current requested compression setting (hibernate_compressor) with the current compression setting (hib_comp_algo). Both values are character arrays of size CRYPTO_MAX_ALG_NAME (128 in this kernel). kernel/power/hibernate.c:50
static char hibernate_compressor[CRYPTO_MAX_ALG_NAME] = CONFIG_HIBERNATION_DEF_COMP;
/*
 * Compression/decompression algorithm to be used while saving/loading
 * image to/from disk. This would later be used in 'kernel/power/swap.c'
 * to allocate comp streams.
 */
char hib_comp_algo[CRYPTO_MAX_ALG_NAME];
hibernate_compressor defaults to lzo if that algorithm is enabled, otherwise to lz4 if enabled5. It can be overwritten using the hibernate.compressor setting to either lzo or lz4. kernel/power/Kconfig:95
choice
	prompt "Default compressor"
	default HIBERNATION_COMP_LZO
	depends on HIBERNATION
config HIBERNATION_COMP_LZO
	bool "lzo"
	depends on CRYPTO_LZO
config HIBERNATION_COMP_LZ4
	bool "lz4"
	depends on CRYPTO_LZ4
endchoice
config HIBERNATION_DEF_COMP
	string
	default "lzo" if HIBERNATION_COMP_LZO
	default "lz4" if HIBERNATION_COMP_LZ4
	help
	  Default compressor to be used for hibernation.
kernel/power/hibernate.c:1425
static const char * const comp_alg_enabled[] = {
#if IS_ENABLED(CONFIG_CRYPTO_LZO)
	COMPRESSION_ALGO_LZO,
#endif
#if IS_ENABLED(CONFIG_CRYPTO_LZ4)
	COMPRESSION_ALGO_LZ4,
#endif
};

static int hibernate_compressor_param_set(const char *compressor,
		const struct kernel_param *kp)
{
	unsigned int sleep_flags;
	int index, ret;

	sleep_flags = lock_system_sleep();

	index = sysfs_match_string(comp_alg_enabled, compressor);
	if (index >= 0) {
		ret = param_set_copystring(comp_alg_enabled[index], kp);
		if (!ret)
			strscpy(hib_comp_algo, comp_alg_enabled[index],
				sizeof(hib_comp_algo));
	} else {
		ret = index;
	}

	unlock_system_sleep(sleep_flags);

	if (ret)
		pr_debug("Cannot set specified compressor %s\n",
			 compressor);

	return ret;
}

static const struct kernel_param_ops hibernate_compressor_param_ops = {
	.set    = hibernate_compressor_param_set,
	.get    = param_get_string,
};

static struct kparam_string hibernate_compressor_param_string = {
	.maxlen = sizeof(hibernate_compressor),
	.string = hibernate_compressor,
};
We then check whether the requested algorithm is supported via crypto_has_comp. If not, we bail out of the whole operation with EOPNOTSUPP. As part of crypto_has_comp we perform any needed initialization of the algorithm, loading kernel modules and running initialization code as needed6.

Grab Locks The next step is to grab the sleep and hibernation locks via lock_system_sleep and hibernate_acquire. kernel/power/hibernate.c:758
sleep_flags = lock_system_sleep();
/* The snapshot device should not be opened while we're running */
if (!hibernate_acquire()) {
	error = -EBUSY;
	goto Unlock;
}
First, lock_system_sleep marks the current thread as not freezable, which will be important later7. It then grabs the system_transition_mutex, which locks taking snapshots or modifying how they are taken, resuming from a hibernation image, entering any suspend state, or rebooting.

The GFP Mask The kernel also issues a warning if the gfp mask is changed via either pm_restore_gfp_mask or pm_restrict_gfp_mask without holding the system_transition_mutex. GFP flags tell the kernel how it is permitted to handle a request for memory. include/linux/gfp_types.h:12
 * GFP flags are commonly used throughout Linux to indicate how memory
 * should be allocated.  The GFP acronym stands for get_free_pages(),
 * the underlying memory allocation function.  Not every GFP flag is
 * supported by every function which may allocate memory.
In the case of hibernation specifically we care about the IO and FS flags, which are reclaim modifiers: ways the system is permitted to attempt to free up memory in order to satisfy a specific request for memory. include/linux/gfp_types.h:176
 * Reclaim modifiers
 * -----------------
 * Please note that all the following flags are only applicable to sleepable
 * allocations (e.g. %GFP_NOWAIT and %GFP_ATOMIC will ignore them).
 *
 * %__GFP_IO can start physical IO.
 *
 * %__GFP_FS can call down to the low-level FS. Clearing the flag avoids the
 * allocator recursing into the filesystem which might already be holding
 * locks.
gfp_allowed_mask sets which flags are permitted to be set at the current time. As the comment below outlines, preventing these flags from being set avoids situations where the kernel needs to do I/O to allocate memory (e.g. reading/writing swap8) but the devices it needs to read/write to/from are not currently available. kernel/power/main.c:24
/*
 * The following functions are used by the suspend/hibernate code to temporarily
 * change gfp_allowed_mask in order to avoid using I/O during memory allocations
 * while devices are suspended.  To avoid races with the suspend/hibernate code,
 * they should always be called with system_transition_mutex held
 * (gfp_allowed_mask also should only be modified with system_transition_mutex
 * held, unless the suspend/hibernate code is guaranteed not to run in parallel
 * with that modification).
 */
static gfp_t saved_gfp_mask;
void pm_restore_gfp_mask(void)
{
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	if (saved_gfp_mask) {
		gfp_allowed_mask = saved_gfp_mask;
		saved_gfp_mask = 0;
	}
}

void pm_restrict_gfp_mask(void)
{
	WARN_ON(!mutex_is_locked(&system_transition_mutex));
	WARN_ON(saved_gfp_mask);
	saved_gfp_mask = gfp_allowed_mask;
	gfp_allowed_mask &= ~(__GFP_IO | __GFP_FS);
}
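The masking itself is simple bit arithmetic; the following userspace sketch (flag values invented for illustration, not the kernel's actual bit assignments) shows how restricting the mask silently strips the reclaim flags from later requests:

#include <stdio.h>

#define MY_GFP_IO 0x40u	/* stand-ins for __GFP_IO / __GFP_FS */
#define MY_GFP_FS 0x80u

int main(void)
{
	unsigned int gfp_allowed_mask = ~0u;		/* everything permitted */
	unsigned int request = MY_GFP_IO | MY_GFP_FS;	/* caller wants IO+FS reclaim */

	/* like pm_restrict_gfp_mask(): forbid IO/FS while devices are suspended */
	gfp_allowed_mask &= ~(MY_GFP_IO | MY_GFP_FS);

	/* only flags that survive the mask take effect; here, none do */
	printf("effective flags: %#x\n", request & gfp_allowed_mask);
	return 0;
}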

Sleep Flags After grabbing the system_transition_mutex the kernel then returns and captures the previous state of the thread's flags in sleep_flags. This is used later to remove PF_NOFREEZE if it wasn't previously set on the current thread. kernel/power/main.c:52
unsigned int lock_system_sleep(void)
{
	unsigned int flags = current->flags;
	current->flags |= PF_NOFREEZE;
	mutex_lock(&system_transition_mutex);
	return flags;
}
EXPORT_SYMBOL_GPL(lock_system_sleep);
include/linux/sched.h:1633
#define PF_NOFREEZE		0x00008000	/* This thread should not be frozen */
Then we grab the hibernate-specific semaphore to ensure no one can open a snapshot or resume from it while we perform hibernation. Additionally this lock is used to prevent hibernate_quiet_exec, which is used by the nvdimm driver to activate its firmware with all processes and devices frozen, ensuring it is the only thing running at that time9. kernel/power/hibernate.c:82
bool hibernate_acquire(void)
{
	return atomic_add_unless(&hibernate_atomic, -1, 0);
}

Prepare Console The kernel next calls pm_prepare_console. This function only does anything if CONFIG_VT_CONSOLE_SLEEP has been set. This prepares the virtual terminal for a suspend state, switching away to a console used only for the suspend state if needed. kernel/power/console.c:130
void pm_prepare_console(void)
{
	if (!pm_vt_switch())
		return;

	orig_fgconsole = vt_move_to_console(SUSPEND_CONSOLE, 1);
	if (orig_fgconsole < 0)
		return;

	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
	return;
}
The first thing is to check whether we actually need to switch the VT. kernel/power/console.c:94
/*
 * There are three cases when a VT switch on suspend/resume are required:
 *   1) no driver has indicated a requirement one way or another, so preserve
 *      the old behavior
 *   2) console suspend is disabled, we want to see debug messages across
 *      suspend/resume
 *   3) any registered driver indicates it needs a VT switch
 *
 * If none of these conditions is present, meaning we have at least one driver
 * that doesn't need the switch, and none that do, we can avoid it to make
 * resume look a little prettier (and suspend too, but that's usually hidden,
 * e.g. when closing the lid on a laptop).
 */
static bool pm_vt_switch(void)
{
	struct pm_vt_switch *entry;
	bool ret = true;

	mutex_lock(&vt_switch_mutex);
	if (list_empty(&pm_vt_switch_list))
		goto out;

	if (!console_suspend_enabled)
		goto out;

	list_for_each_entry(entry, &pm_vt_switch_list, head) {
		if (entry->required)
			goto out;
	}

	ret = false;
out:
	mutex_unlock(&vt_switch_mutex);
	return ret;
}
There is an explanation of the conditions under which a switch is performed in the comment above the function, but we'll also walk through the steps here. Firstly we grab the vt_switch_mutex to ensure nothing will modify the list while we're looking at it. We then examine the pm_vt_switch_list. This list is used to indicate the drivers that require a switch during suspend. They register this requirement, or the lack thereof, via pm_vt_switch_required. kernel/power/console.c:31
/**
 * pm_vt_switch_required - indicate VT switch at suspend requirements
 * @dev: device
 * @required: if true, caller needs VT switch at suspend/resume time
 *
 * The different console drivers may or may not require VT switches across
 * suspend/resume, depending on how they handle restoring video state and
 * what may be running.
 *
 * Drivers can indicate support for switchless suspend/resume, which can
 * save time and flicker, by using this routine and passing 'false' as
 * the argument.  If any loaded driver needs VT switching, or the
 * no_console_suspend argument has been passed on the command line, VT
 * switches will occur.
 */
void pm_vt_switch_required(struct device *dev, bool required)
Next, we check console_suspend_enabled. This is set to false by the kernel parameter no_console_suspend, but defaults to true. Finally, if there are any entries in the pm_vt_switch_list, then we check to see if any of them require a VT switch. Only if none of these conditions apply do we return false. If a VT switch is in fact required, then we first move the currently active virtual terminal/console10 (vt_move_to_console) and then the current location of kernel messages (vt_kmsg_redirect) to the SUSPEND_CONSOLE. The SUSPEND_CONSOLE is the last entry in the list of possible consoles, and appears to just be a black hole to throw away messages. kernel/power/console.c:16
#define SUSPEND_CONSOLE	(MAX_NR_CONSOLES-1)
Interestingly, these are separate functions because you can use TIOCL_SETKMSGREDIRECT (an ioctl11) to send kernel messages to a specific virtual terminal, but by default it's the same as the currently active console. The previously active console and the previous kernel message location are stored in orig_fgconsole and orig_kmsg, to restore the state of the console and kernel messages after the machine wakes up again. Interestingly, this means orig_fgconsole also ends up storing any errors, so it has to be checked to ensure it's not less than zero before we try to do anything with the kernel messages on both suspend and resume. drivers/tty/vt/vt_ioctl.c:1268
/* Perform a kernel triggered VT switch for suspend/resume */
static int disable_vt_switch;
int vt_move_to_console(unsigned int vt, int alloc)
{
	int prev;

	console_lock();
	/* Graphics mode - up to X */
	if (disable_vt_switch) {
		console_unlock();
		return 0;
	}
	prev = fg_console;

	if (alloc && vc_allocate(vt)) {
		/* we can't have a free VC for now. Too bad,
		 * we don't want to mess the screen for now. */
		console_unlock();
		return -ENOSPC;
	}

	if (set_console(vt)) {
		/*
		 * We're unable to switch to the SUSPEND_CONSOLE.
		 * Let the calling function know so it can decide
		 * what to do.
		 */
		console_unlock();
		return -EIO;
	}
	console_unlock();

	if (vt_waitactive(vt + 1)) {
		pr_debug("Suspend: Can't switch VCs.");
		return -EINTR;
	}
	return prev;
}
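As an aside, the kernel-message redirection mentioned above is reachable from userspace too. Here is a hedged sketch of mine (needs root and a Linux virtual console; error handling kept minimal) that redirects kernel messages to VT 3 via the TIOCL_SETKMSGREDIRECT subcode of the TIOCLINUX ioctl:

#include <fcntl.h>
#include <linux/tiocl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/tty0", O_RDWR);
	/* byte 0: TIOCLINUX subcode; byte 1: target virtual terminal */
	char arg[2] = { TIOCL_SETKMSGREDIRECT, 3 };

	if (fd < 0) {
		perror("open /dev/tty0");
		return 1;
	}
	if (ioctl(fd, TIOCLINUX, arg) < 0)
		perror("ioctl TIOCLINUX");
	close(fd);
	return 0;
}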
Unlike most other locking functions we've seen so far, console_lock needs to be careful to ensure nothing else is panicking and needs to dump to the console before grabbing the semaphore for the console and setting a couple of flags.

Panics Panics are tracked via an atomic integer set to the id of the processor currently panicking. kernel/printk/printk.c:2649
/**
 * console_lock - block the console subsystem from printing
 *
 * Acquires a lock which guarantees that no consoles will
 * be in or enter their write() callback.
 *
 * Can sleep, returns nothing.
 */
void console_lock(void)
{
	might_sleep();

	/* On panic, the console_lock must be left to the panic cpu. */
	while (other_cpu_in_panic())
		msleep(1000);

	down_console_sem();
	console_locked = 1;
	console_may_schedule = 1;
}
EXPORT_SYMBOL(console_lock);
kernel/printk/printk.c:362
/*
 * Return true if a panic is in progress on a remote CPU.
 *
 * On true, the local CPU should immediately release any printing resources
 * that may be needed by the panic CPU.
 */
bool other_cpu_in_panic(void)
{
	return (panic_in_progress() && !this_cpu_in_panic());
}
kernel/printk/printk.c:345
static bool panic_in_progress(void)
{
	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
}
kernel/printk/printk.c:350
/* Return true if a panic is in progress on the current CPU. */
bool this_cpu_in_panic(void)
{
	/*
	 * We can use raw_smp_processor_id() here because it is impossible for
	 * the task to be migrated to the panic_cpu, or away from it. If
	 * panic_cpu has already been set, and we're not currently executing on
	 * that CPU, then we never will be.
	 */
	return unlikely(atomic_read(&panic_cpu) == raw_smp_processor_id());
}
console_locked is a debug value, used to indicate that the lock should be held, and our first indication that this whole virtual terminal system is more complex than might initially be expected. kernel/printk/printk.c:373
/*
 * This is used for debugging the mess that is the VT code by
 * keeping track if we have the console semaphore held. It's
 * definitely not the perfect debug tool (we don't know if _WE_
 * hold it and are racing, but it helps tracking those weird code
 * paths in the console code where we end up in places I want
 * locked without the console semaphore held).
 */
static int console_locked;
console_may_schedule is used to see if we are permitted to sleep and schedule other work while we hold this lock. As we'll see later, the virtual terminal subsystem is not re-entrant, so there are all sorts of hacks in here to ensure we don't leave important code sections that can't be safely resumed.

Disable VT Switch As the comment below lays out, when another program is handling the graphical display anyway, there's no need to do any of this, so the kernel provides a switch to turn the whole thing off. Interestingly, this appears to only be used by three drivers, so the specific hardware support required must not be particularly common.
drivers/gpu/drm/omapdrm/dss
drivers/video/fbdev/geode
drivers/video/fbdev/omap2
drivers/tty/vt/vt_ioctl.c:1308
/*
 * Normally during a suspend, we allocate a new console and switch to it.
 * When we resume, we switch back to the original console.  This switch
 * can be slow, so on systems where the framebuffer can handle restoration
 * of video registers anyways, there's little point in doing the console
 * switch.  This function allows you to disable it by passing it '0'.
 */
void pm_set_vt_switch(int do_switch)
 
	console_lock();
	disable_vt_switch = !do_switch;
	console_unlock();
 
EXPORT_SYMBOL(pm_set_vt_switch);
The rest of the vt_move_to_console function is pretty normal, however: it simply allocates space if needed to create the requested virtual terminal and then sets the current virtual terminal via set_console.

Virtual Terminal Set Console With set_console, we begin (as if we haven't been already) to enter the madness that is the virtual terminal subsystem. As mentioned previously, modifications to its state must be made very carefully, as other things happening at the same time could create complete messes. All this to say, calling set_console does not actually perform any work to change the state of the current console. Instead it indicates what changes it wants and then schedules that work. drivers/tty/vt/vt.c:3153
int set_console(int nr)
{
	struct vc_data *vc = vc_cons[fg_console].d;

	if (!vc_cons_allocated(nr) || vt_dont_switch ||
		(vc->vt_mode.mode == VT_AUTO && vc->vc_mode == KD_GRAPHICS)) {
		/*
		 * Console switch will fail in console_callback() or
		 * change_console() so there is no point scheduling
		 * the callback
		 *
		 * Existing set_console() users don't check the return
		 * value so this shouldn't break anything
		 */
		return -EINVAL;
	}

	want_console = nr;
	schedule_console_callback();
	return 0;
}
The check for vc->vc_mode == KD_GRAPHICS is where most end-user graphical desktops will bail out of this change, as they're in graphics mode and don't need to switch away to the suspend console. vt_dont_switch is a flag used by the ioctls11 VT_LOCKSWITCH and VT_UNLOCKSWITCH to prevent the system from switching virtual terminal devices when the user has explicitly locked it. VT_AUTO is a flag indicating that automatic virtual terminal switching is enabled12, and thus deliberate switching to a suspend terminal is not required. However, if you do run your machine from a virtual terminal, then we indicate to the system that we want to change to the requested virtual terminal via the want_console variable and schedule a callback via schedule_console_callback. drivers/tty/vt/vt.c:315
void schedule_console_callback(void)
{
	schedule_work(&console_work);
}
console_work is a workqueue2 that will execute the given task asynchronously.

Console Callback drivers/tty/vt/vt.c:3109
/*
 * This is the console switching callback.
 *
 * Doing console switching in a process context allows
 * us to do the switches asynchronously (needed when we want
 * to switch due to a keyboard interrupt).  Synchronization
 * with other console code and prevention of re-entrancy is
 * ensured with console_lock.
 */
static void console_callback(struct work_struct *ignored)
{
	console_lock();

	if (want_console >= 0) {
		if (want_console != fg_console &&
		    vc_cons_allocated(want_console)) {
			hide_cursor(vc_cons[fg_console].d);
			change_console(vc_cons[want_console].d);
			/* we only changed when the console had already
			   been allocated - a new console is not created
			   in an interrupt routine */
		}
		want_console = -1;
	}
...
console_callback first looks to see if there is a console change wanted via want_console and then changes to it if it's not the current console and has been allocated already. We first remove any cursor state with hide_cursor. drivers/tty/vt/vt.c:841
static void hide_cursor(struct vc_data *vc)
{
	if (vc_is_sel(vc))
		clear_selection();

	vc->vc_sw->con_cursor(vc, false);
	hide_softcursor(vc);
}
A full dive into the tty driver is a task for another time, but this should give a general sense of how this system interacts with hibernation.

Notify Power Management Call Chain kernel/power/hibernate.c:767
pm_notifier_call_chain_robust(PM_HIBERNATION_PREPARE, PM_POST_HIBERNATION)
This will call a chain of power management callbacks, passing PM_HIBERNATION_PREPARE to each. If any callback fails, the callbacks already notified are called again with PM_POST_HIBERNATION to roll back, and an error is returned. kernel/power/main.c:98
int pm_notifier_call_chain_robust(unsigned long val_up, unsigned long val_down)
{
	int ret;

	ret = blocking_notifier_call_chain_robust(&pm_chain_head, val_up, val_down, NULL);

	return notifier_to_errno(ret);
}
The power management notifier is a blocking notifier chain, which means it has the following properties. include/linux/notifier.h:23
 *	Blocking notifier chains: Chain callbacks run in process context.
 *		Callouts are allowed to block.
The callback chain is a linked list with each entry containing a priority and a function to call. The function technically takes in a data value, but it is always NULL for the power management chain. include/linux/notifier.h:49
struct notifier_block;
typedef	int (*notifier_fn_t)(struct notifier_block *nb,
			unsigned long action, void *data);
struct notifier_block {
	notifier_fn_t notifier_call;
	struct notifier_block __rcu *next;
	int priority;
};
The head of the linked list is protected by a read-write semaphore. include/linux/notifier.h:65
struct blocking_notifier_head {
	struct rw_semaphore rwsem;
	struct notifier_block __rcu *head;
};
Because it is prioritized, appending to the list requires walking it until an item with lower13 priority is found to insert the current item before. kernel/notifier.c:252
/*
 *	Blocking notifier chain routines.  All access to the chain is
 *	synchronized by an rwsem.
 */
static int __blocking_notifier_chain_register(struct blocking_notifier_head *nh,
					      struct notifier_block *n,
					      bool unique_priority)
{
	int ret;

	/*
	 * This code gets used during boot-up, when task switching is
	 * not yet working and interrupts must remain disabled.  At
	 * such times we must not call down_write().
	 */
	if (unlikely(system_state == SYSTEM_BOOTING))
		return notifier_chain_register(&nh->head, n, unique_priority);

	down_write(&nh->rwsem);
	ret = notifier_chain_register(&nh->head, n, unique_priority);
	up_write(&nh->rwsem);
	return ret;
}
kernel/notifier.c:20
/*
 *	Notifier chain core routines.  The exported routines below
 *	are layered on top of these, with appropriate locking added.
 */
static int notifier_chain_register(struct notifier_block **nl,
				   struct notifier_block *n,
				   bool unique_priority)
{
	while ((*nl) != NULL) {
		if (unlikely((*nl) == n)) {
			WARN(1, "notifier callback %ps already registered",
			     n->notifier_call);
			return -EEXIST;
		}
		if (n->priority > (*nl)->priority)
			break;
		if (n->priority == (*nl)->priority && unique_priority)
			return -EBUSY;
		nl = &((*nl)->next);
	}
	n->next = *nl;
	rcu_assign_pointer(*nl, n);
	trace_notifier_register((void *)n->notifier_call);
	return 0;
}
Each callback can return one of a series of options. include/linux/notifier.h:18
#define NOTIFY_DONE		0x0000		/* Don't care */
#define NOTIFY_OK		0x0001		/* Suits me */
#define NOTIFY_STOP_MASK	0x8000		/* Don't call further */
#define NOTIFY_BAD		(NOTIFY_STOP_MASK|0x0002)
						/* Bad/Veto action */
When notifying the chain, if a function returns STOP or BAD then the previous parts of the chain are called again with PM_POST_HIBERNATION14 and an error is returned. kernel/notifier.c:107
/**
 * notifier_call_chain_robust - Inform the registered notifiers about an event
 *                              and rollback on error.
 * @nl:		Pointer to head of the blocking notifier chain
 * @val_up:	Value passed unmodified to the notifier function
 * @val_down:	Value passed unmodified to the notifier function when recovering
 *              from an error on @val_up
 * @v:		Pointer passed unmodified to the notifier function
 *
 * NOTE:	It is important the @nl chain doesn't change between the two
 *		invocations of notifier_call_chain() such that we visit the
 *		exact same notifier callbacks; this rules out any RCU usage.
 *
 * Return:	the return value of the @val_up call.
 */
static int notifier_call_chain_robust(struct notifier_block **nl,
				     unsigned long val_up, unsigned long val_down,
				     void *v)
{
	int ret, nr = 0;

	ret = notifier_call_chain(nl, val_up, v, -1, &nr);
	if (ret & NOTIFY_STOP_MASK)
		notifier_call_chain(nl, val_down, v, nr-1, NULL);

	return ret;
}
Each of these callbacks tends to be quite driver-specific, so we'll cease discussion of this here.
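To ground this, here is a hedged kernel-module sketch of mine (compilable against the usual module build infrastructure, not runnable standalone) that hooks into this exact chain via the standard register_pm_notifier helper:

#include <linux/module.h>
#include <linux/suspend.h>

static int my_pm_notify(struct notifier_block *nb,
			unsigned long action, void *data)
{
	switch (action) {
	case PM_HIBERNATION_PREPARE:
		pr_info("hibernation about to start\n");
		break;
	case PM_POST_HIBERNATION:
		pr_info("hibernation finished, or failed and rolled back\n");
		break;
	}
	/* returning NOTIFY_BAD here would veto the hibernation attempt */
	return NOTIFY_OK;
}

static struct notifier_block my_pm_nb = {
	.notifier_call = my_pm_notify,
};

static int __init my_init(void)
{
	return register_pm_notifier(&my_pm_nb);
}

static void __exit my_exit(void)
{
	unregister_pm_notifier(&my_pm_nb);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");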

Sync Filesystems The next step is to ensure all filesystems have been synchronized to disk. This is performed via a simple helper function that times how long the full synchronization operation, ksys_sync, takes. kernel/power/main.c:69
void ksys_sync_helper(void)
{
	ktime_t start;
	long elapsed_msecs;

	start = ktime_get();
	ksys_sync();
	elapsed_msecs = ktime_to_ms(ktime_sub(ktime_get(), start));
	pr_info("Filesystems sync: %ld.%03ld seconds\n",
		elapsed_msecs / MSEC_PER_SEC, elapsed_msecs % MSEC_PER_SEC);
}
EXPORT_SYMBOL_GPL(ksys_sync_helper);
ksys_sync wakes and instructs a set of flusher threads to write out every filesystem, first their inodes15, then the full filesystem, and then finally all block devices, to ensure all pages are written out to disk. fs/sync.c:87
/*
 * Sync everything. We start by waking flusher threads so that most of
 * writeback runs on all devices in parallel. Then we sync all inodes reliably
 * which effectively also waits for all flusher threads to finish doing
 * writeback. At this point all data is on disk so metadata should be stable
 * and we tell filesystems to sync their metadata via ->sync_fs() calls.
 * Finally, we writeout all block devices because some filesystems (e.g. ext2)
 * just write metadata (such as inodes or bitmaps) to block device page cache
 * and do not sync it on their own in ->sync_fs().
 */
void ksys_sync(void)
{
	int nowait = 0, wait = 1;

	wakeup_flusher_threads(WB_REASON_SYNC);
	iterate_supers(sync_inodes_one_sb, NULL);
	iterate_supers(sync_fs_one_sb, &nowait);
	iterate_supers(sync_fs_one_sb, &wait);
	sync_bdevs(false);
	sync_bdevs(true);
	if (unlikely(laptop_mode))
		laptop_sync_completion();
}
It follows an interesting pattern of using iterate_supers to run first sync_inodes_one_sb and then sync_fs_one_sb on each known filesystem16. It also calls both sync_fs_one_sb and sync_bdevs twice, first without waiting for any operations to complete and then again waiting for completion17. When laptop_mode is enabled, the system runs additional filesystem synchronization operations after a specified delay without any writes. mm/page-writeback.c:111
/*
 * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
 * a full sync is triggered after this time elapses without any disk activity.
 */
int laptop_mode;
EXPORT_SYMBOL(laptop_mode);
However, when running a filesystem synchronization operation, the system will add an additional timer to schedule more writes after the laptop_mode delay. We don't want the state of the system to change at all while performing hibernation, so we cancel those timers. mm/page-writeback.c:2198
/*
 * We're in laptop mode and we've just synced. The sync's writes will have
 * caused another writeback to be scheduled by laptop_io_completion.
 * Nothing needs to be written back anymore, so we unschedule the writeback.
 */
void laptop_sync_completion(void)
{
	struct backing_dev_info *bdi;

	rcu_read_lock();

	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
		del_timer(&bdi->laptop_mode_wb_timer);

	rcu_read_unlock();
}
As a side note, the ksys_sync function is simply called when the system call sync is used. fs/sync.c:111
SYSCALL_DEFINE0(sync)
{
	ksys_sync();
	return 0;
}

The End of Preparation With that, the system has finished preparations for hibernation. This is a somewhat arbitrary cutoff, but next the system will begin a full freeze of userspace, then dump memory out to an image, and finally perform hibernation. All this will be covered in future articles!
  1. Hibernation modes are outside of scope for this article, see the previous article for a high-level description of the different types of hibernation.
  2. Workqueues are a mechanism for running asynchronous tasks. A full description of them is a task for another time, but the kernel documentation on them is available here: https://www.kernel.org/doc/html/v6.9/core-api/workqueue.html 2
  3. This is a bit of an oversimplification, but since this isn t the main focus of this article this description has been kept to a higher level.
  4. Kconfig is Linux s build configuration system that sets many different macros to enable/disable various features.
  5. Kconfig defaults to the first default found
  6. Including checking whether the algorithm is larval, which appears to indicate that it requires additional setup; an interesting choice of name for such a state.
  7. Specifically when we get to process freezing, which we ll get to in the next article in this series.
  8. Swap space is outside the scope of this article, but in short it is a buffer on disk that the kernel uses to store memory not currently in use to free up space for other things. See Swap Management for more details.
  9. The code for this is lengthy and tangential, thus it has not been included here. If you re curious about the details of this, see kernel/power/hibernate.c:858 for the details of hibernate_quiet_exec, and drivers/nvdimm/core.c:451 for how it is used in nvdimm.
  10. Annoyingly this code appears to use the terms console and virtual terminal interchangeably.
  11. ioctls are special device-specific I/O operations that permit performing actions outside of the standard file interactions of read/write/seek/etc. 2
  12. I'm not entirely clear on how this flag works; this subsystem is particularly complex.
  13. In this case a higher number is higher priority.
  14. Or whatever the caller passes as val_down, but in this case we re specifically looking at how this is used in hibernation.
  15. An inode refers to a particular file or directory within the filesystem. See Wikipedia for more details.
  16. Each active filesystem is registered with the kernel through a structure known as a superblock, which contains references to all the inodes contained within the filesystem, as well as function pointers to perform the various required operations, like sync.
  17. I'm including minimal code in this section, as I'm not looking to deep-dive into the filesystem code at this time.

1 September 2024

Bits from Debian: Bits from the DPL

Dear Debian community, these are my bits from the DPL for August. Happy Birthday Debian On 16th of August Debian celebrated its 31st birthday. Since I'm unable to write a better text than our great publicity team, I'm simply linking to their article for those who might have missed it: https://bits.debian.org/2024/08/debian-turns-31.html Removing more packages from unstable Helmut Grohne argued for more aggressive package removal and sought consensus on a way forward. He provided six examples of processes where packages that are candidates for removal are consuming valuable person-power. I'd like to add that the Bug of the Day initiative (see below) also frequently encounters long-unmaintained packages with popcon votes sometimes as low as zero, and often fewer than ten. Helmut's email included a list of packages that would meet the suggested removal criteria. There was some discussion about whether a popcon vote should be included in these criteria, with arguments both for and against it. Although I support including popcon, I acknowledge that Helmut has a valid point in suggesting it be left out. While I've read several emails in agreement, Scott Kitterman made a valid point: "I don't think we need more process. We just need someone to do the work of finding the packages and filing the bugs." I agree that this is crucial to ensure an automated process doesn't lead to unwanted removals. However, I don't see "someone" stepping up to file RM bugs against other maintainers' packages. As long as we have strict ownership of packages, many people are hesitant to touch a package, even to fix it. Asking for its removal might be even less well received. Therefore, if an automated procedure were to create RM bugs based on defined criteria, it could help reduce some of the social pressure. In this respect the opinion of Niels Thykier is interesting: "As much as I want automation, I do not mind the prototype starting as a semi-automatic process if that is what it takes to get started." Charles Plessy put the urgency of the package-removal problem into words: "So as of today, it is much less work to keep a package rotting than removing it." My observation when trying to fix the Bug of the Day exactly fits this statement. I would love for this discussion to lead to more aggressive removals that we can agree upon, whether they are automated, semi-automated, or managed by a person processing an automatically generated list (supported by an objective procedure). To use an analogy: I've found that every image collection improves with aggressive pruning. Similarly, I'm convinced that Debian will improve if we remove packages that no longer serve our users well. DEP14 / DEP18 There are two DEPs that affect our workflow for maintaining packages, particularly for those who agree on using Git for Debian packages. DEP-14 recommends a standardized layout for Git packaging repositories, which benefits maintainers working across teams and makes it easier for newcomers to learn a consistent repository structure. DEP-14 stalled for various reasons. Sam Hartman suspected it might be because 'it doesn't bring sufficient value.' However, the assumption that git-buildpackage is incompatible with DEP-14 is incorrect, as confirmed by its author, Guido Günther. As one of the two key tools for Debian Git repositories (besides dgit), it fully supports DEP-14, though the migration from the previous default is somewhat complex.
Some investigation into mass-converting older formats to DEP-14 was conducted by the Perl team, as Gregor Herrmann pointed out. The discussion about DEP-14 resurfaced with the suggestion of DEP-18. Guido Günther proposed the title "Encourage Continuous Integration and Merge Request-Based Collaboration for Debian Packages", which more accurately reflects the DEP's technical intent. Otto Kekäläinen, who initiated DEP-18 (thank you, Otto), provided a good summary of the current status. He also assembled a very helpful overview of Git and GitLab usage in other Linux distros. More Salsa CI As a result of the DEP-18 discussion, Otto Kekäläinen suggested implementing Salsa CI for our top popcon packages. I believe it would be a good idea to enable CI by default across Salsa whenever a new repository is created. Progress in Salsa migration In my campaign, I stated that I aim to reduce the number of packages maintained outside Salsa to below 2,000. As of March 28, 2024, the count was 2,368. Today, it stands at 2,187 (UDD query: SELECT DISTINCT count(*) FROM sources WHERE release = 'sid' and vcs_url not like '%salsa%' ;). After a third of my DPL term (OMG), we've made significant progress, reducing the number in question (369 packages) by nearly half. I'm pleased with the support from the DDs who moved their packages to Salsa. Some packages were transferred as part of the Bug of the Day initiative (see below). Bug of the Day As announced in my 'Bits from the DPL' talk at DebConf, I started an initiative called Bug of the Day. The goal is to train newcomers in bug triaging by enabling them to tackle small, self-contained QA tasks. We have consistently identified target packages and resolved at least one bug per day, often addressing multiple bugs in a single package. In several cases, we followed the Package Salvaging procedure outlined in the Developers Reference. Most instances were either welcomed by the maintainer or did not elicit a response. Unfortunately, there was one exception where the recipient of the Package Salvage bug expressed significant dissatisfaction. The takeaway is to balance formal procedures with consideration for the recipient's perspective. I'm pleased to confirm that the Matrix channel has seen an increase in active contributors. This aligns with my hope that our efforts would attract individuals interested in QA work. I'm particularly pleased that, within just one month, we have had help with both fixing bugs and improving the code that aids in bug selection. As I aim to introduce newcomers to various teams within Debian, I also take the opportunity to learn about each team's specific policies myself. I rely on team members' assistance to adapt to these policies. I find that gaining this practical insight into team dynamics is an effective way to understand the different teams within Debian as DPL. Another finding from this initiative, which aligns with my goal as DPL, is that many of the packages we addressed are already on Salsa but have not been uploaded, meaning their VCS fields are not published. This suggests that maintainers are generally open to managing their packages on Salsa. For packages that were not yet on Salsa, the move was generally welcomed. Publicity team wants you The publicity team has decided to resume regular meetings to coordinate their efforts. Given my high regard for their work, I plan to attend their meetings as frequently as possible, which I began doing with the first IRC meeting.
During discussions with some team members, I learned that the team could use additional help. If anyone interested in supporting Debian with non-packaging tasks reads this, please consider introducing yourself to debian-publicity@lists.debian.org. Note that this is a publicly archived mailing list, so it's not the best place for sharing private information. Kind regards, Andreas.
