Search Results: "Gunnar Wolf"

19 November 2025

Gunnar Wolf: While it is cold-ish season in the Northern Hemisphere...

Last week, our university held a Mega Vaccination Center. Things cannot be small or regular with my university, ever! According to the official information, during last week 31,000 people were given a total of 74,000 vaccine doses against influenza, COVID-19, pneumococcal disease and measles (specific vaccines for each person selected according to an age profile). I was a tiny blip in said numbers: one person, three shots. Took me three hours, but I am quite happy to have been among the huge crowd.

Long, long line (photo credit: La Jornada, 2025.11.14). Really vaccinated!

And why am I bringing this up? Because I have long been involved in organizing DebConf, the best conference ever, naturally devoted to improving Debian GNU/Linux. And last year, our COVID reaction procedures ended up hurting people we care about. We, as organizers, are taking it seriously to shape a humane COVID handling policy that is, at the same time, responsible and respectful of people who are (reasonably!) afraid of catching the infection. No, COVID did not disappear in 2022, and its effects are not something we can turn a blind eye to.

Next year, DebConf will take place in Santa Fe, Argentina, in July. This means it will be a Winter DebConf. And while you can catch COVID (or influenza, or just a bad cold) at any time of year, the odds are a bit higher then. I know not every country still administers free COVID or influenza vaccines to anybody who requests them. And I know that any protection I might have gotten now will be quite a bit weaker by July. But I feel it necessary to ask everyone who can get a shot to get one. Most Northern Hemisphere countries will have a vaccination campaign (or at least higher vaccine availability) before winter.

If you plan to attend DebConf (hell, if you plan to attend any massive gathering of people travelling from all over the world to sit in a crowded auditorium) during the next year, please act responsibly, for yourself and for those surrounding you. Get vaccinated.
It won't absolutely save you from catching it, but it will reduce the probability. And if you do catch it, you will probably have a much milder version. And thus, you will spread it less during the first days until (and if!) you start developing symptoms.

14 November 2025

Gunnar Wolf: 404 not found

Found this graffiti on the wall behind my house today: 404 not found!

21 October 2025

Gunnar Wolf: LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation

This post is a review for Computing Reviews for LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation, an article published in Proceedings of the ACM on Software Engineering, Volume 2, Issue ISSTA.
How good can large language models (LLMs) be at generating code? This may not seem like a very novel question, as several benchmarks (for example, HumanEval and MBPP, published in 2021) existed before LLMs burst into public view and started the current artificial intelligence (AI) inflation. However, as the paper's authors point out, code generation is very seldom done as an isolated function; instead, generated code must be deployed in a coherent fashion together with the rest of the project or repository it is meant to be integrated into. Today, several benchmarks (for example, CoderEval or EvoCodeBench) measure the functional correctness of LLM-generated code via test case pass rates. This paper brings a new proposal to the table: evaluating repository-level LLM-generated code by examining the hallucinations it produces.

The authors begin by running the Python code generation tasks proposed in the CoderEval benchmark against six code-generating LLMs. Next, they analyze the results and build a taxonomy to describe code-based LLM hallucinations, with three types of conflicts (task requirement, factual knowledge, and project context) as first-level categories and eight subcategories within them. The authors then compare the results of each of the LLMs per main hallucination category. Finally, they try to find the root cause of the hallucinations.

The paper is structured very clearly, not only presenting the three research questions (RQ) but also referring to them as needed to explain why and how each partial result is interpreted. RQ1 (establishing a hallucination taxonomy) is the most thoroughly explored. While RQ2 (LLM comparison) is clear, it just presents straightforward results without much analysis. RQ3 (root cause discussion) is undoubtedly interesting, but I feel it is much more speculative and not directly related to the analysis performed.

After tackling their research questions, Zhang et al. propose a possible mitigation to counter the effect of hallucinations: enhance the LLM with retrieval-augmented generation (RAG) so it better understands task requirements, factual knowledge, and project context. The presented results show that all of the models are clearly (though modestly) improved by the proposed RAG-based mitigation.

The paper is clearly written and easy to read. It should provide its target audience with interesting insights and discussions. I would have liked more details on their RAG implementation, but I suppose that's for a follow-up work.

14 October 2025

Gunnar Wolf: Can a server be just too stable?

One of my servers at work leads a very light life: it is our main backups server (so it has an I/O spike at night, with little CPU involvement) and has some minor services running (i.e. a couple of Tor relays and my personal email server; yes, I have the authorization for it). It is a very stable machine. But today I was surprised: as I am about to migrate it to Debian 13 (Trixie), naturally, I am set to reboot it. But before doing so:
$ w
 12:21:54 up 1048 days, 0 min,  1 user,  load average: 0.22, 0.17, 0.17
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
gwolf             192.168.10.3     12:21           0.00s  0.02s sshd-session: gwolf [priv]
Wow. Did I really last reboot this server on December 1, 2022? (Yes, I know this might speak badly of my security practices, as there are several kernel updates I never applied, even having installed the relevant packages. Still, it left me impressed.) Debian. Rock solid. Debian rocks!
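For the curious, the uptime figure that `w` prints comes from the kernel's /proc/uptime counter (seconds since boot). A minimal sketch of the same formatting, as my own illustration rather than the actual procps code:

```python
def format_uptime(seconds: float) -> str:
    # Split seconds-since-boot into days, hours and minutes,
    # roughly the way `w` and `uptime` display long uptimes.
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return f"up {days} days, {hours:02d}:{minutes:02d}"

def read_uptime_seconds(proc_path: str = "/proc/uptime") -> float:
    # On Linux, the first whitespace-separated field of /proc/uptime
    # is the uptime in seconds (the second is idle time).
    with open(proc_path) as fh:
        return float(fh.read().split()[0])

# 1048 days' worth of seconds formats back to the figure in the post.
print(format_uptime(1048 * 86400))
```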

10 October 2025

Reproducible Builds: Reproducible Builds in September 2025

Welcome to the September 2025 report from the Reproducible Builds project! Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. As ever, if you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website. In this report:

  1. Reproducible Builds Summit 2025
  2. Can't we have nice things?
  3. Distribution work
  4. Tool development
  5. Reproducibility testing framework
  6. Upstream patches

Reproducible Builds Summit 2025 Please join us at the upcoming Reproducible Builds Summit, set to take place from October 28th to 30th 2025 in Vienna, Austria! We are thrilled to host the eighth edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin, Hamburg and Athens. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. If you're interested in joining us this year, please make sure to read the event page, which has more details about the event and location. Registration is open until 20th September 2025, and we are very much looking forward to seeing many readers of these reports there!

Can't we have nice things? Debian Developer Gunnar Wolf blogged about George V. Neville-Neil's Kode Vicious column in Communications of the ACM, in which reproducible builds are mentioned without needing to be introduced (assuming familiarity across the computing industry and academia). Titled Can't we have nice things?, the article mentions:
Once the proper measurement points are known, we want to constrain the system such that what it does is simple enough to understand and easy to repeat. It is quite telling that the push for software that enables reproducible builds only really took off after an embarrassing widespread security issue ended up affecting the entire Internet. That there had already been 50 years of software development before anyone thought that introducing a few constraints might be a good idea is, well, let's just say it generates many emotions, none of them happy, fuzzy ones. [ ]

Distribution work In Debian this month, Johannes Starosta filed a bug against the debian-repro-status package, reporting that it does not work on Debian trixie. (An upstream bug report was also filed.) Furthermore, 17 reviews of Debian packages were added, 10 were updated and 14 were removed this month, adding to our knowledge about identified issues. In March's report, we included the news that Fedora would aim for 99% package reproducibility; this change has now been deferred to Fedora 44, according to Phoronix. Lastly, Bernhard M. Wiedemann posted another openSUSE monthly update for their work there.

Tool development diffoscope version 306 was uploaded to Debian unstable by Chris Lamb. It included contributions already covered in previous months as well as some changes by Zbigniew Jędrzejewski-Szmek to address issues with the fdtdump support [ ] and to move away from the deprecated codecs.open method. [ ][ ] strip-nondeterminism version 1.15.0-1 was uploaded to Debian unstable by Chris Lamb. It included a contribution by Matwey Kornilov to add support for inline archive files for Erlang's escript [ ]. kpcyrd has released a new version of rebuilderd. As a quick recap, rebuilderd is an automatic build scheduler that tracks binary packages available in a Linux distribution and attempts to compile the official binary packages from their (purported) source code and dependencies. The code for in-toto attestations has been reworked, and instances now feature a new endpoint that can be queried to fetch the list of public keys an instance currently identifies itself by. [ ] Lastly, Holger Levsen bumped the Standards-Version field of disorderfs, with no changes needed. [ ][ ]

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In August, however, a number of changes were made by Holger Levsen, including:
  • Setting up six new rebuilderd workers with 16 cores and 16 GB RAM each.
  • reproduce.debian.net-related:
    • Do not expose pending jobs; they are confusing without explanation. [ ]
    • Add a link to v1 API specification. [ ]
    • Drop rebuilderd-worker.conf on a node. [ ]
    • Allow manual scheduling for any architectures. [ ]
    • Update path to trixie graphs. [ ]
    • Use the same rebuilder-debian.sh script for all hosts. [ ]
    • Add all other suites to all other archs. [ ][ ][ ][ ]
    • Update SSH host keys for new hosts. [ ]
    • Move to the pull184 branch. [ ][ ][ ][ ][ ]
    • Only allow 20 GB cache for workers. [ ]
  • OpenWrt-related:
    • Grant developer aparcar full sudo control on the ionos30 node. [ ][ ]
  • Jenkins nodes:
    • Add a number of new nodes. [ ][ ][ ][ ][ ]
    • Don't expect /srv/workspace to exist on OSUOSL nodes. [ ]
    • Stop hardcoding IP addresses in munin.conf. [ ]
    • Add maintenance and health check jobs for new nodes. [ ]
    • Document slight changes in IONOS resources usage. [ ]
  • Misc:
    • Drop disabled Alpine Linux tests for good. [ ]
    • Move Debian live builds and some other Debian builds to the ionos10 node. [ ]
    • Cleanup some legacy support from releases before Debian trixie. [ ]
In addition, Jochen Sprickerhof made the following changes relating to reproduce.debian.net:
  • Do not expose pending jobs on the main site. [ ]
  • Switch the frontpage to reference Debian forky [ ], but do not attempt to build Debian forky on the armel architecture [ ].
  • Use consistent and up to date rebuilder-debian.sh script. [ ]
  • Fix supported worker architectures. [ ]
  • Add a basic excuses page. [ ]
  • Move to the pull184 branch. [ ][ ][ ][ ]
  • Fix a typo in the JavaScript. [ ]
  • Update front page for the new v1 API. [ ][ ]
Lastly, Roland Clobus did some maintenance relating to the reproducibility testing of the Debian Live images. [ ][ ][ ][ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Finally, if you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. Alternatively, you can get in touch with us via:

21 September 2025

Gunnar Wolf: We, Programmers: A Chronicle of Coders from Ada to AI

This post is a review for Computing Reviews for We, Programmers: A Chronicle of Coders from Ada to AI, a book published by Addison-Wesley.
When this book was presented as available for review, I jumped on it. After all, who doesn't love reading a nice bit of computing history, as told by a well-known author (affectionately known as Uncle Bob), one who has been immersed in computing since forever? What's not to like there? Reading on, the book does not disappoint. Much to the contrary, it digs into details absent in most computer history books that, being an operating systems and computer architecture geek, I absolutely enjoyed.

But let me first address the book's organization. The book is split into four parts. Part 1, Setting the Stage, is a short introduction, answering the question Who are we? (we being the programmers, of course). It describes the fascination many of us felt when we realized that the computer was there to obey us, to do our bidding, and we could absolutely control it.

Part 2 talks about the giants of the computing world, on whose shoulders we stand. It digs in with a level of detail I have never seen before, discussing their personal lives and technical contributions (as well as the hoops they had to jump through to get their work done). Nine chapters cover these giants, ranging chronologically from Charles Babbage and Ada Lovelace to Ken Thompson, Dennis Ritchie, and Brian Kernighan (understandably, giants who worked together are grouped in the same chapter). This is the part with the most historically overlooked technical details. For example, what was the word size in the first computers, before even the concept of a byte had been brought into regular use? What was the register structure of early central processing units (CPUs), and why did it lead to requiring self-modifying code to be able to execute loops?

Then, just as Unix and C get invented, Part 3 skips to computer history as seen through the eyes of Uncle Bob. I must admit, while the change of rhythm initially startled me, it ends up working quite well. The focus is no longer on the giants of the field, but on one particular person (who casts a very long shadow). The narrative follows the author's life: a boy with access to electronics due to his father's line of work; a computing industry leader, in the early 2000s, with extreme programming; one of the first producers of training materials in video format, a role that today might be recognized as an influencer. This first-person narrative reaches the year 2023.

But the book is not just a historical overview of the computing world, of course. Uncle Bob includes a final section with his thoughts on the future of computing. As this is a book for programmers, it is fitting to start with the changes in programming languages that we should expect to see and where such changes are likely to take place. The unavoidable topic of artificial intelligence is presented next: what is it, and what does it spell for computing, and in particular for programming? Interesting (and sometimes surprising) questions follow: What does the future of hardware development look like? What is prone to be the evolution of the World Wide Web? What is the future of programming and programmers?

At just under 500 pages, the book is a volume to be taken seriously. But space is very well used in this text. The material is easy to read, often funny and always informative. If you enjoy computer history and understanding the little details of the implementations, it might very well be the book you want.

18 September 2025

Gunnar Wolf: Still use Twitter/X? Consider dropping it...

Many people who were once enthusiastic Twitter users have dropped it as a direct or indirect effect of its ownership change and the policy changes that followed. Given that Twitter/X is becoming ever more irrelevant, it is less interesting and enticing for more and more people. But also, its current core users (mostly, hate-apologists of the right-wing mindset that finds conspiracy theories everywhere) are becoming more commonplace, and by sheer probability (if not for algorithmic bias), it becomes ever more likely that a given piece of content will be linked to what its authors would classify as crap.

So, there has been, in effect, an X exodus. This has been reported in media outlets as important as Reuters or The Guardian, research institutes such as Berkeley, and even media that, no matter how hard you push, cannot be identified as the radical left Mr. Trump is so happy to blame for everything, such as Forbes.

Today I read a short note in a magazine I very much enjoy, Communications of the ACM, where SIGDOC (the ACM's Special Interest Group on Design of Communication) is officially closing their X account. The reasoning is crystal clear. They have the mission to create and study User Experience (UX) implementations and report on it, focused on making communication clearer and more human centered. That is no longer, for many reasons, a goal that can be furthered by means of an X account.

(BTW, how many people are actually angry that Mr. Musk took the old X11 logo and made it his? I am sure it is now protected under too many layers of legalese, even though I have been aware of it for at least 30 years.)

11 September 2025

Gunnar Wolf: Saying _hi_ to my good Reproducible Builds friends while reading a magazine article

Just wanted to share that I enjoy reading George V. Neville-Neil's Kode Vicious column, which regularly appears in some of the ACM publications I follow, such as ACM Queue or Communications. Today I was very pleasantly surprised: while reading the column titled Can't we have nice things?, Kode Vicious answers a question on why computing has nothing comparable to the beauty of ancient physics laboratories turned into museums (i.e. Faraday's laboratory) by giving a great hat tip to a project that stemmed from Debian, and where many of my good Debian friends spend a lot of their energies: Reproducible Builds. KV says:
Once the proper measurement points are known, we want to constrain the system such that what it does is simple enough to understand and easy to repeat. It is quite telling that the push for software that enables reproducible builds only really took off after an embarrassing widespread security issue ended up affecting the entire Internet. That there had already been 50 years of software development before anyone thought that introducing a few constraints might be a good idea is, well, let's just say it generates many emotions, none of them happy, fuzzy ones.
Yes, KV is a seasoned free software author. But I found it heartwarming that the Reproducible Builds project is mentioned without needing to introduce it (assuming familiarity across the computing industry and academia), recognized as game-changing, as we understood it would be over ten years ago when it was first announced, and enabling of beauty in computing. Congratulations to all of you who have made this possible!

25 August 2025

Gunnar Wolf: The comedy of computation, or, how I learned to stop worrying and love obsolescence

This post is a review for Computing Reviews for The Comedy of Computation, or, How I Learned to Stop Worrying and Love Obsolescence, a book published by Stanford University Press.
The Comedy of Computation is not an easy book to review. It is a very enjoyable book that analyzes several examples of how being computational has been approached across literary genres in the last century: how authors of stories, novels, theatrical plays and movies, focusing on comedic genres, have understood the role of the computer in defining human relations, reactions and even self-image.

Mangrum structures his work in six thematic chapters, where he presents different angles on human society: how racial stereotypes have advanced in human imagination and perception of a future where we interact with mechanical or computational partners (from mechanical tools performing jobs that were identified with racial profiles to intelligent robots that threaten to control society); the genericity of computers, and how people can be seen as generic, interchangeable characters, often fueled by the tendency people exhibit to confer anthropomorphic qualities on inanimate objects; people's desire to be seen as truly authentic, regardless of what that ultimately means; romantic involvement and romance-led stories (with the computer seen as a facilitator for human-to-human romances, a distractor away from them, or itself a part of the couple); and the absurdity in anthropomorphization, in comparing fundamentally different aspects such as intelligence and speed at solving mathematical operations, as well as the absurdity presented blatantly as such by several techno-utopian visions.

But presenting this as a linear set of concepts does not do justice to the book. Throughout the sections of each chapter, a different work serves as the axis: novels and stories, Hollywood movies, Broadway plays, some covers of Time magazine, a couple of presentations of the would-be future, even a romantic comedy entirely written by bots. And for each of them, Benjamin Mangrum presents a very thorough analysis, drawing relations and comparing with contemporary works, but also with Shakespeare, classical Greek myths, and a very long etcetera.

This book is hard to review because of the depth of the work the author did: reading it repeatedly made me look for other works, or at least longer references for them. Still, despite being a work of such erudition, Mangrum's text is easy and pleasant to read, without feeling heavy or written in an overly academic style. I very much enjoyed reading this book. It is certainly not a technical book about computers and society in any way; it is an exploration of human creativity and our understanding of the aspects the author has found central to understanding the impact of computing on humankind.

However, there is one point I must mention before closing: I believe the editorial decision to present the work as a running text, with all the material conceptualized as footnotes presented as a separate, over-50-page-long final chapter, detracts from the final result. Personally, I enjoy reading the footnotes because they reveal the author's thought processes, even if they stray from the central line of thought. Even more, given that my review copy was a PDF, I could not even keep said chapter open with one finger, bouncing back and forth. For all purposes, I missed out on the notes; now that I have finished reading and stumbled upon that chapter, I know I missed an important part of the enjoyment.

17 July 2025

Gunnar Wolf: About our proof-of-concept LLM tool for navigating Debian's manpages

So, for the first time, this year's DebConf had the DebConf Academic Track, that is, content for a one-day-long set of short sessions, for each of which there was a written article presenting the content, often with a very academic format, but not necessarily. I hope that for future DebConfs we will continue to hold this track, and that we can help bridge the gap: to get people who are not usually from the academic or university world to prepare content that will be formally expressed and included in a long-lasting, indexed book of proceedings. We did have (informal) proceedings in many of the early DebConfs, and I'm very happy to regain that, but with 20 years of better practices.

Anyway, of course, I took the opportunity to join this experiment, together with my Mexican colleague Tzolkin Garduño, who is finishing her PhD here in France (or should I say, second to her, as she is the true leading author of our work). And here you can see the paper we submitted to the DebConf Academic Track, which will be published soon: A retrieval-augmented-generation pipeline to help users query system-provided documentation. The corresponding source code is all available at Tzolkin's repository on GitHub.

So, what is it about, in shorter words and layman's terms? Debian has lots of documentation, but lacks in discoverability. We targeted our venerable manpages: it is well known that manpages are relevant, well-organized documentation describing how to use each of the binaries we ship in Debian. Eric Raymond wrote in his well-known essay The Art of Unix Programming (2003) that the Unix cultural style is telegraphic but complete: it does not hold you by the hand, but it usually points in the right direction. Our original intent was to digest all of the information in our manpages, but we narrowed it to only the first section of the manual due to the limitations of the hardware a good friend lent us to play with LLMs.
We took four different base, redistributable (although, yes, non-DFSG-free) Large Language Models downloaded from HuggingFace (T5-small, MiniLM-L6-v2, BGE-small-en and Multilingual-e5-small), and trained them with the 34,579 pages found inside /usr/share/man/man1 of all of the existing Debian packages. We did some interesting fine-tuning (for further details, please refer to the paper itself or to our GitHub repository). The idea is to present an interactive tool that understands natural language queries and answers with the specific manpage to which they best relate (I'd like to say the one that answers best, but Tzolkin has repeatedly tried to correct my understanding of the resulting vectorial distances). I had prepared some images to present as interaction examples, but I'll wrap up this post with something even better. So, this is what you would get with the following queries:

We were happy to present it like this. During DebCamp, however, I was able to devote some time to translating my client into a Web-accessible system. Do note that it's a bit brittle and might break now and then. But you are welcome to play with it! Play with the Web interface for our RAG for Debian manuals!

I find it worth sharing that, while we used quite a bit of GPU for the training (not too much: a couple of nights on an old-although-powerful nVidia 1070 card lent to us by the Felipe Esquivel Fund for Science and Cultural Support), all querying takes place in the CPU, and I usually get between 0.1 and 0.3 seconds per query. My server's architecture is far from rock-solid, and it can break now and then, but at least it should respawn upon failure. And it should at least be available there for a couple of weeks into August. Anyway, this has been a very interesting journey getting my toes wet in the waters of LLM, and while current results are a bit crude, I think this can be made into an interesting tool for Debian exploration.
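To give a rough idea of the retrieval step, here is a toy sketch. To be clear, this is not our implementation (which uses the trained HuggingFace embedding models listed above): a crude bag-of-words vector stands in for the learned embeddings, and the three "manpage descriptions" are made up for illustration. The shape of the answer is the same, though: embed the query, embed each document, return the closest match by cosine similarity.

```python
import math
from collections import Counter

STOP = {"a", "an", "the", "to", "in", "for", "of", "and", "that", "how", "do", "i"}

def embed(text: str) -> Counter:
    # Stand-in for a learned sentence embedding: a bag of crudely
    # stemmed words (strip trailing "s"), with stopwords removed.
    words = (w.rstrip("s") or w for w in text.lower().split())
    return Counter(w for w in words if w not in STOP)

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_manpage(query: str, corpus: dict[str, str]) -> str:
    # Return the manpage whose description vector is closest to the query.
    q = embed(query)
    return max(corpus, key=lambda name: cosine(q, embed(corpus[name])))

corpus = {  # toy stand-ins for the 34,579 man1 page descriptions
    "grep(1)": "print lines that match patterns in files",
    "tar(1)": "an archiving utility to store and extract files",
    "ssh(1)": "remote login client to log into a remote machine",
}
print(best_manpage("how do I search for a pattern in a file", corpus))
```

With a trained embedding model, `embed` would map semantically related queries and descriptions close together even without shared words, which is exactly what this bag-of-words stand-in cannot do.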

30 June 2025

Gunnar Wolf: Get your personalized map of DebConf25 in Brest

As I often do, this year I have also prepared a set of personalized maps for your OpenPGP keysigning in DebConf25, in Brest! What is that, dare you ask?

Partial view of my OpenPGP map

One of the not-to-be-missed traditions of DebConf is a Key-Signing Party (KSP) that spans the whole conference! Travelling from all the corners of the world to a single, large group gathering, we have the ideal opportunity to spread some trust in our peers' identities (and, hopefully, no communicable diseases) and strengthen Debian's OpenPGP keyring. But whom should you approach for keysigning? Go find yourself in the nice listing I have prepared. By clicking on your long keyid (in my case, the link labeled 0x2404C9546E145360), anybody can download your certificate (public key + signatures). The SVG and PNG links will yield a graphic version of your position within the DC25 keyring, and the TXT link will give you a textual explanation of it. (Of course, your links will differ, yada yada.)

Please note this is still a preview of our KSP information: you will notice there are several outstanding things for me to fix before marking the file as final. First, some names have encoding issues I will fix. Second, some keys might be missing: if you submitted your key as part of the conference registration form but it is not showing, it must be because my scripts didn't find it on any of the queried keyservers. My scripts are querying the following servers:
hkps://keyring.debian.org/
hkps://keys.openpgp.org/
hkps://keyserver.computer42.org/
hkps://keyserver.ubuntu.com/
hkps://pgp.mit.edu/
hkps://pgp.pm/
hkps://pgp.surf.nl/
hkps://pgpkeys.eu/
hkps://the.earth.li/
Make sure your key is available on at least some of them; I will try to do a further run on Friday, before travelling, or shortly after arriving in France. If you didn't submit your key in time, but you will be at DC25, please mail me with [DC25 KSP] in your subject line, and I will manually add it to the list. On (hopefully!) Friday, I'll post the final, canonical KSP coordination page, which you should download and whose SHA256 checksum you should calculate. We will have printed convenience sheets at the front desk to help you do your keysigning.
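For reference, all of those hkps:// servers speak the same HTTP Keyserver Protocol, so fetching a certificate boils down to a single GET request against the /pks/lookup endpoint. A minimal sketch of the URL construction (the keyid is mine, from above; `hkp_lookup_url` is my own illustrative helper, and the exact query options individual servers accept may vary):

```python
from urllib.parse import urlencode

def hkp_lookup_url(keyserver: str, long_keyid: str) -> str:
    # hkps:// is the HTTP Keyserver Protocol over TLS, so an https:// URL
    # against /pks/lookup with op=get fetches the ASCII-armored certificate
    # (public key plus the third-party signatures it has accumulated).
    base = keyserver.replace("hkps://", "https://").rstrip("/")
    query = urlencode({"op": "get", "search": long_keyid})
    return f"{base}/pks/lookup?{query}"

# Fetching this URL (e.g. with urllib.request or curl) returns my certificate.
print(hkp_lookup_url("hkps://keys.openpgp.org/", "0x2404C9546E145360"))
```

This is also roughly what `gpg --keyserver hkps://keys.openpgp.org --recv-keys 0x2404C9546E145360` does under the hood before importing the result into your keyring.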

23 June 2025

Gunnar Wolf: Private key management: Oh, the humanity...

If we ever thought a couple of years or decades of constant use would get humankind to understand how an asymmetric key pair is to be handled... it's time we moved back to square one. I had to complete an online procedure (trámite) with the Mexican federal government to get a statement certifying I successfully finished my studies, and I found this jewel of user interface: E.firma. So I have to:
  1. Submit the asymmetric key I use for tax purposes, as that's the ID the government has registered for me. OK, I didn't expect it to be used for this purpose as well, but I'll accept it. Of course, in our tax system many people don't require having a public key generated (easier regimes are authenticated by password only), but every professional with a cédula profesional (everybody obtaining a university degree) is now compelled to do this step.
  2. Not only do I have to submit my certificate (public key), but also the private part (and, of course, the password that secures it). I understand I'm interacting with a JavaScript thingie that runs only client-side, and I trust it is not shipping my private key to their servers. But given it is an opaque script, I have no assurance about it. And, of course, this irks me because I am who I am and because I've spent several years thinking about cryptography. But for regular people, it just looks like a stupid inconvenience: they have to upload two weird files with odd names and provide a password. What for?
This is beyond stupid. I'm baffled. (Of course, I did it, because I need the fsckin' document. Oh, and of course, I paid my MX$1,770.80 for it, which does not make me too happy for a procedure that's not even shuffling papers, only storing the right bits in the right corner of the right datacenter, but anyhow.)

19 June 2025

Debian Outreach Team: GSoC 2025 Introduction: Make Debian for Raspberry Pi Build Again

Hello everyone! I am Kurva Prashanth, interested in the lower-level workings of system software, CPUs/SoCs and hardware design. I was introduced to Open Hardware and Embedded Linux while studying electronics and embedded systems as part of robotics coursework. Initially, I did not pay much attention to it and quickly moved on. However, a short talk on Liberating SBCs using Debian by Yuvraj at MiniDebConf India, 2021 caught my interest. The talk focused on Open Hardware platforms such as Olimex and BeagleBone Black, as well as the Debian distributions tailored for these ARM-based single-board computers, and it intrigued me to delve deeper into the realm of Open Hardware and Embedded Linux. These days I'm trying to improve my abilities to contribute to Debian and Linux kernel development. Before finding out about the Google Summer of Code project, I had already started my journey with Debian. I extensively used Debian system build tools (debootstrap, sbuild, dpkg-buildpackage, qemu-debootstrap) for building a Debian image for the Bela Cape, a real-time OS for music making, to achieve extremely fast audio and sensor processing times. In 2023, I had the opportunity to attend DebConf23 in Kochi, India, thanks to Nilesh Patra (@nilesh). I met Hector Oron (@zumbi) over dinner at DebConf23, and it was nice talking about his contributions/work at Debian on the armhf port and Debian system administration. That conversation got me interested in knowing more about Debian ARM and the installer, and I found it fascinating that EmDebian was once an external project bringing Debian to embedded systems, while now Debian itself can be run on many embedded systems. Also, during DebCamp I got introduced to PGP/GPG keys and the web of trust by Carlos Henrique Lima Melara (@charles), and I learned how to use and generate GPG keys. After DebConf23 I tried Debian packaging, and I miserably failed to get sponsorship for a Python library I packaged.
I came across the Debian project for this year's Google Summer of Code, found the project titled Make Debian for Raspberry Pi Build Again quite interesting, and applied. Gladly, on May 8th, I received an acceptance e-mail from GSoC. I got excited that I'll spend the summer working on something that I like doing. I am thrilled to be part of this project and super excited for the summer of '25, looking forward to working on what I most like, new connections, and learning opportunities. So, let me talk a bit more about my project: I will be working to Make Debian for Raspberry Pi SBCs build again under the guidance of Gunnar Wolf (@gwolf). In this post, I will describe the project I will be working on.

Why make Debian for Raspberry Pi build again? There is an available set of images for running Debian on Raspberry Pi computers (all models below the 5 series)! However, the maintainer severely lacks time to take care of them; he has called for somebody to adopt them, but has not been successful. The image generation scripts might have bitrotted a bit, but the work is mostly all done. And there is still a lot of interest in having the images freshly generated and decently tested! This GSoC project is about getting the Raspberry Pi Debian images site (https://raspi.debian.net/) working reliably, making daily-built images automatic again, and ideally making it easily deployable to run on project machines, migrating the existing hosting infrastructure to Debian.

How much does it differ from the Debian build process? While the goal is to stay as close as possible to the Debian build process, Raspberry Pi boards require some necessary platform-specific changes, primarily in the early boot sequence and firmware handling. Unlike typical Debian systems, Raspberry Pi boards depend on a non-standard bootloader and use non-free firmware (raspi-firmware), introducing some hardware-specific differences in the initialization process. These differences are largely confined to the early boot and hardware initialization stages. Once the system boots, the userspace remains closely aligned with a typical Debian install, using Debian packages. The current modifications are required due to the non-free firmware. However, several areas merit review:
  1. Boot flow: Transitioning to a U-Boot based boot process (as used in Debian installer images for many other SBCs) would reduce divergence and better align with Debian Installer.
  2. Current scripts/workarounds: Some existing hacks may now be redundant with recent upstream support and could be removed.
  3. Board-specific images: Shifting to architecture-specific base images with runtime detection could simplify builds and reduce duplication.
Debian is already building SD card images for a wide range of SBCs (e.g., BeagleBone, BananaPi, OLinuXino, Cubieboard) under installer-arm64/images/u-boot and installer-armhf/images/u-boot; a similar approach for Raspberry Pi could improve maintainability and consistency with Debian's broader SBC support.
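The runtime detection that makes per-architecture images feasible ultimately comes down to reading the board name the firmware exposes through the device tree; on real hardware this is /proc/device-tree/model. A rough sketch, using a fake file so it runs anywhere:

```shell
# On ARM boards, the firmware-provided device tree exposes a model string,
# which tools such as flash-kernel match against their machine database.
# Real systems read /proc/device-tree/model (NUL-terminated); we fake it
# under a temporary directory so this sketch is self-contained.
FAKE=$(mktemp -d)
printf 'Raspberry Pi 4 Model B Rev 1.4\0' > "$FAKE/model"
tr -d '\0' < "$FAKE/model"
```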

Quoted from Mail Discussion Thread with Mentor (Gunnar Wolf)
"One direction we wanted to explore was whether we should still be building one image per family, or whether we could instead switch to one image per architecture (armel, armhf, arm64). There were some details to iron out, as RPi3 and RPi4 were quite different, but I think it will be similar to the differences between the RPi 0 and 1, which are handled at first-boot time. To understand what differs between families, take a look at Cyril Brulebois' generate-recipe (in the repo), which is a great improvement over the ugly mess I had before he contributed it."
In this project, I intend to build one image per architecture (armel, armhf, arm64) rather than continuing with the current model of building one image per board. This change simplifies image management, reduces redundancy, and leverages dynamic configuration at boot time to support all boards within each architecture. By using U-Boot and flash-kernel, we can detect the board type and configure kernel parameters, DTBs, and firmware during the first boot, reducing duplication across images and simplifying the maintenance burden, while still supporting board-specific behavior at runtime. This method aligns with existing practices in the DebianInstaller team and with Debian's long-term maintainability goals, and it better leverages upstream capabilities, ensuring a consistent and scalable boot experience. To streamline and standardize the process of building bootable Debian images for Raspberry Pi devices, I propose a new workflow that leverages the U-Boot and flash-kernel Debian packages, providing a clean, maintainable, and reproducible way to generate images for armel, armhf and arm64 boards. The workflow is built around vmdb2, a lightweight, declarative tool designed to automate the creation of disk images. A typical vmdb2 recipe defines the disk layout, base system installation (via debootstrap), architecture-specific packages, and any custom post-install hooks; the image includes U-Boot (the u-boot-rpi package), flash-kernel, and a suitable Debian kernel package like linux-image-arm64 or linux-image-armmp. U-Boot serves as the platform's bootloader and is responsible for loading the kernel and initramfs. Unlike Raspberry Pi's non-free firmware/proprietary bootloader, U-Boot provides an open and scriptable interface, allowing us to follow a more standard Debian boot process. It can be configured to boot using either an extlinux.conf or a boot.scr script generated automatically by flash-kernel.
The role of flash-kernel is to bridge Debian's kernel installation system with the specifics of embedded bootloaders like U-Boot. When installed, it automatically copies the kernel image, initrd, and device tree blobs (DTBs) to the /boot partition. It also generates the necessary boot.scr script if the board configuration demands it. To work correctly, flash-kernel requires that the target machine be identified via /etc/flash-kernel/machine, which must correspond to an entry in its internal machine database. Once the vmdb2 build is complete, the resulting image will contain a fully configured bootable system with all necessary boot components correctly installed. The image can be flashed to an SD card and used to boot on the intended device without additional manual configuration. Because all key packages (U-Boot, kernel, flash-kernel) are managed through Debian's package system, kernel updates and boot script regeneration are handled automatically during system upgrades.
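The machine-identification step can be shown in miniature. A sketch (the machine string must match a flash-kernel database entry; here a scratch directory stands in for the image's mounted rootfs instead of a real vmdb2 build):

```shell
# flash-kernel decides what to install based on /etc/flash-kernel/machine.
# In the vmdb2 recipe this file is created inside the image; here we stage
# it under a scratch directory standing in for the image's root filesystem.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/etc/flash-kernel"
echo "Raspberry Pi 4 Model B" > "$ROOT/etc/flash-kernel/machine"
cat "$ROOT/etc/flash-kernel/machine"
```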

Current Workflow: Builds One Image per Family. The current vmdb2 recipe uses the Raspberry Pi GPU bootloader provided via the raspi-firmware package. This is the traditional boot process followed by Raspberry Pi OS, and it's tightly coupled with firmware files like bootcode.bin, start.elf, and fixup.dat. These files are installed to /boot/firmware, which is mounted from a FAT32 partition labeled RASPIFIRM. The device tree files (*.dtb) are manually copied from /usr/lib/linux-image-*-arm64/broadcom/ into this partition. The kernel is installed via the linux-image-arm64 package, and the boot arguments are injected by modifying /boot/firmware/cmdline.txt using sed commands. Booting depends on the root partition being labeled RASPIROOT, referenced through that file. No bootloader such as UEFI or U-Boot is involved; the Raspberry Pi firmware loads the kernel directly, which is standard for Raspberry Pi boards.
- apt: install
  packages:
    ...
    - raspi-firmware  
The boot partition contents and kernel boot setup are tightly controlled via scripting in the recipe. Limitations of the current workflow: while this setup works, it has several drawbacks:
  1. Proprietary and Raspberry Pi specific: It relies on the closed-source GPU bootloader from the raspi-firmware package, which is tightly coupled to specific Raspberry Pi models.
  2. Manual DTB handling: Device tree files are manually copied and hardcoded, making upgrades or board-specific changes error-prone.
  3. Not easily extendable to future Raspberry Pi boards: Any change in bootloader behavior (as seen in the Raspberry Pi 5, which introduces a more flexible firmware boot process) would require significant rework.
  4. No UEFI/U-Boot: The current method bypasses the standard bootloader layers, making it inconsistent with other Debian ARM platforms and harder to maintain long-term.
As Raspberry Pi firmware and boot processes evolve, especially with the introduction of Pi 5 and potentially Pi 6, maintaining compatibility will require more flexibility - something best delivered by adopting U-Boot and flash-kernel.

New Workflow: Building Architecture-Specific Images with vmdb2, U-Boot, flash-kernel, and Debian Kernel This workflow outlines an improved approach to generating bootable Debian images architecture specific, using vmdb2, U-Boot, flash-kernel, and Debian kernels and also to move away from Raspberry Pi s proprietary bootloader to a fully open-source boot process which improves maintainability, consistency, and cross-board support.

New Method: Shift to U-Boot + flash-kernel. U-Boot (via Debian's u-boot-rpi package) and flash-kernel bring the image building process closer to how Debian officially boots ARM devices. flash-kernel integrates with the system's initramfs and kernel packages to install bootloaders, prepare boot.scr or extlinux.conf, and copy kernel/initrd/DTBs to /boot in a format that U-Boot expects. U-Boot will be used as a second-stage bootloader, loaded by the Raspberry Pi's built-in firmware. Once U-Boot is in place, it will read standard boot scripts (boot.scr) generated by flash-kernel, providing a Debian-compatible and board-flexible solution. Extending the YAML spec for a vmdb2 build with U-Boot and flash-kernel: the goal is to improve the existing vmdb2 YAML spec (https://salsa.debian.org/raspi-team/image-specs/raspi_master.yaml) to integrate U-Boot, flash-kernel, and the architecture-specific Debian kernel into the image build process. By incorporating u-boot-rpi and flash-kernel from Debian packages, alongside the standard initramfs-tools, we align the image closer to Debian best practices while supporting both armhf and arm64 architectures. Below are the key additions and adjustments needed in a vmdb2 YAML spec to support the workflow. Install U-Boot, flash-kernel, initramfs-tools and the architecture-specific Debian kernel:
- apt: install
  packages:
    - u-boot-rpi
    - flash-kernel
    - initramfs-tools
    - linux-image-arm64 # or linux-image-armmp for armhf 
  tag: tag-root
Replace linux-image-arm64 with the correct kernel package for the specific target architecture. These packages should be added under the tag-root section in the YAML spec for the vmdb2 build recipe. This ensures that the necessary bootloader, kernel, and initramfs tools are included and properly configured in the image. Configure the Raspberry Pi firmware to load U-Boot: install the U-Boot binary as kernel.img in /boot/firmware. (We could also download and build U-Boot from source, but Debian provides tested binaries.)
- shell: |
    cp /usr/lib/u-boot/rpi_4/u-boot.bin ${ROOT?}/boot/firmware/kernel.img
    echo "enable_uart=1" >> ${ROOT?}/boot/firmware/config.txt
  root-fs: tag-root
This makes the RPi firmware load u-boot.bin instead of the Linux kernel directly. Set up flash-kernel for a Debian-style boot: flash-kernel integrates with initramfs-tools and writes a boot config suitable for U-Boot. We need to make sure /etc/flash-kernel/db contains an entry for the board (most Raspberry Pi boards are already supported in Bookworm). Set up /etc/flash-kernel.conf with:
- create-file: /etc/flash-kernel.conf
  contents: |
    MACHINE="Raspberry Pi 4"
    BOOTPART="/dev/disk/by-label/RASPIFIRM"
    ROOTPART="/dev/disk/by-label/RASPIROOT"
  unless: rootfs_unpacked
This allows flash-kernel to write an extlinux.conf or boot.scr into /boot/firmware. Clean up the proprietary/non-free firmware boot flow by removing the direct kernel loading steps:
- shell: |
    rm -f ${ROOT?}/boot/firmware/vmlinuz*
    rm -f ${ROOT?}/boot/firmware/initrd.img*
    rm -f ${ROOT?}/boot/firmware/cmdline.txt
  root-fs: tag-root
Let U-Boot and flash-kernel manage the kernel/initrd and boot parameters instead. Boot flow after this change:
[SoC ROM] -> [start.elf] -> [U-Boot] -> [boot.scr] -> [Linux Kernel]
  1. This still depends on the Raspberry Pi firmware to start, but it only loads U-Boot, not the Linux kernel.
  2. U-Boot gives you more flexibility (e.g., networking, boot menus, signed boot).
  3. Using flash-kernel ensures kernel updates are handled the Debian Installer way.
  4. Test with a serial console (enable_uart=1) in case HDMI doesn't show early boot logs.
Advantage of New Workflow
  1. Replaces the proprietary Raspberry Pi bootloader with upstream U-Boot.
  2. Debian-native tooling: Uses flash-kernel and initramfs-tools to manage boot configuration.
  3. Consistent across boards: Works for both armhf and arm64, unifying the image build process.
  4. Easier to support new boards: Like the Raspberry Pi 5 and future models.
This transition will standardize the image-building process somewhat, aligning it with upstream Debian Installer workflows.

vmdb2 configuration for arm64 using U-Boot and flash-kernel. NOTE: This is a baseline example and may require tuning.
# Raspberry Pi arm64 image using U-Boot and flash-kernel
steps:
  # ... (existing mkimg, partitions, mount, debootstrap, etc.) ...
  # Install U-Boot, flash-kernel, initramfs-tools and the architecture-specific kernel
  - apt: install
    packages:
      - u-boot-rpi
      - flash-kernel
      - initramfs-tools
      - linux-image-arm64 # or linux-image-armmp for armhf
    tag: tag-root
  # Install the U-Boot binary as kernel.img in the firmware partition
  - shell: |
      cp /usr/lib/u-boot/rpi_arm64/u-boot.bin ${ROOT?}/boot/firmware/kernel.img
      echo "enable_uart=1" >> ${ROOT?}/boot/firmware/config.txt
    root-fs: tag-root
  # Configure flash-kernel for Raspberry Pi
  - create-file: /etc/flash-kernel.conf
    contents: |
      MACHINE="Generic Raspberry Pi ARM64"
      BOOTPART="/dev/disk/by-label/RASPIFIRM"
      ROOTPART="/dev/disk/by-label/RASPIROOT"
    unless: rootfs_unpacked
  # Remove direct kernel boot files from the Raspberry Pi firmware partition
  - shell: |
      rm -f ${ROOT?}/boot/firmware/vmlinuz*
      rm -f ${ROOT?}/boot/firmware/initrd.img*
      rm -f ${ROOT?}/boot/firmware/cmdline.txt
    root-fs: tag-root
  # flash-kernel will manage boot scripts and extlinux.conf
  # Rest of image build continues...

Required Changes to Support Raspberry Pi Boards in Debian (flash-kernel + U-Boot)

Overview of Required Changes
Component Required Task
Debian U-Boot Package Add build target for rpi_arm64 in u-boot-rpi. Optionally deprecate legacy 32-bit targets.
Debian flash-kernel Package Add or verify entries in db/all.db for Pi 4, Pi 5, Zero 2W, CM4. Ensure boot script generation works via bootscr.uboot-generic.
Debian Kernel Ensure DTBs are installed at /usr/lib/linux-image-<version>/ and available for flash-kernel to reference.

flash-kernel

Already Supported Boards in flash-kernel Debian Package https://sources.debian.org/src/flash-kernel/3.109/db/all.db/#L1700
Model Arch DTB-Id
Raspberry Pi 1 A/B/B+, Rev2 armel bcm2835-*
Raspberry Pi CM1 armel bcm2835-rpi-cm1-io1.dtb
Raspberry Pi Zero/Zero W armel bcm2835-rpi-zero*.dtb
Raspberry Pi 2B armhf bcm2836-rpi-2-b.dtb
Raspberry Pi 3B/3B+ arm64 bcm2837-*
Raspberry Pi CM3 arm64 bcm2837-rpi-cm3-io3.dtb
Raspberry Pi 400 arm64 bcm2711-rpi-400.dtb

uboot

Already Supported Boards in Debian U-Boot Package https://salsa.debian.org/installer-team/flash-kernel/-/blob/master/db/all.db

arm64
Model Arch Upstream Defconfig Debian Target
Raspberry Pi 3B arm64 rpi_3_defconfig rpi_3
Raspberry Pi 4B arm64 rpi_4_defconfig rpi_4
Raspberry Pi 3B/3B+/CM3/CM3+/4B/CM4/400/5B/Zero 2W arm64 rpi_arm64_defconfig rpi_arm64

armhf
Model Arch Upstream Defconfig Debian Target
Raspberry Pi 2 armhf rpi_2_defconfig rpi_2
Raspberry Pi 3B (32-bit) armhf rpi_3_32b_defconfig rpi_3_32b
Raspberry Pi 4B (32-bit) armhf rpi_4_32b_defconfig rpi_4_32b

armel
Model Arch Upstream Defconfig Debian Target
Raspberry Pi armel rpi_defconfig rpi
Raspberry Pi 1/Zero armel rpi_0_w rpi_0_w

These boards are already defined in debian/rules of the u-boot-rpi source package, which generates usable U-Boot binaries for the corresponding Raspberry Pi models.

To-Do: Add Missing Board Support to U-Boot and flash-kernel in Debian. Several Raspberry Pi models are missing from the Debian U-Boot and flash-kernel packages: even though upstream support exists and DTBs ship in the Debian kernel, the flash-kernel database lacks the entries needed to enable bootloader installation and initrd handling.

Boards Not Yet Supported in flash-kernel Debian Package
Model Arch DTB-Id
Raspberry Pi 3A+ (32 & 64 bit) armhf, arm64 bcm2837-rpi-3-a-plus.dtb
Raspberry Pi 4B (32 & 64 bit) armhf, arm64 bcm2711-rpi-4-b.dtb
Raspberry Pi CM4 arm64 bcm2711-rpi-cm4-io.dtb
Raspberry Pi CM 4S arm64 -
Raspberry Pi Zero 2 W arm64 bcm2710-rpi-zero-2-w.dtb
Raspberry Pi 5 arm64 bcm2712-rpi-5-b.dtb
Raspberry Pi CM5 arm64 -
Raspberry Pi 500 arm64 -

Boards Not Yet Supported in Debian U-Boot Package
Model Arch Upstream defconfig(s)
Raspberry Pi 3A+/3B+ arm64 -, rpi_3_b_plus_defconfig
Raspberry Pi CM 4S arm64 -
Raspberry Pi 5 arm64 -
Raspberry Pi CM5 arm64 -
Raspberry Pi 500 arm64 -

So, what next? During the Community Bonding Period, I got hands-on with workflow improvements, set up test environments, and began reviewing Raspberry Pi support in Debian's U-Boot and flash-kernel packages. I keep logs of the project, where I provide weekly reports on the work done; you can check the Community Bonding Period logs. My next steps include submitting patches to the u-boot and flash-kernel packages to ensure all missing Raspberry Pi entries are built and shipped, confirming the kernel DTB installation paths, and making sure the necessary files are included for all Raspberry Pi variants. Finally, I plan to validate the changes with test builds on Raspberry Pi hardware. In parallel, I'm organizing my tasks and setting up my environment to contribute more effectively. It's been exciting to explore how things work under the hood and to prepare for a summer of learning and contributing to this great community.

11 June 2025

Gunnar Wolf: Understanding Misunderstandings - Evaluating LLMs on Networking Questions

This post is a review for Computing Reviews for Understanding Misunderstandings - Evaluating LLMs on Networking Questions, an article published in ACM SIGCOMM Computer Communication Review.
Large language models (LLMs) have awed the world, emerging as the fastest-growing application of all time: ChatGPT reached 100 million active users in January 2023, just two months after its launch. After an initial hype cycle, they have gradually been mostly accepted and incorporated into various workflows, and their basic mechanics are no longer beyond the understanding of people with moderate computer literacy. Now that the technology is better understood, we face the question of how convenient LLM chatbots are for different occupations. This paper embarks on the question of whether LLMs can be useful for networking applications. It systematizes querying three popular LLMs (GPT-3.5, GPT-4, and Claude 3) with questions taken from several network management online courses and certifications, and presents a taxonomy of six axes along which the incorrect responses were classified. The authors also measure four strategies for improving answers, observing that, while some of those strategies were marginally useful, they sometimes resulted in degraded performance. The authors queried the commercially available instances of Gemini and GPT, which achieved scores over 90 percent for basic subjects but fared notably worse in topics that require understanding and converting between different numeric notations, such as working with Internet protocol (IP) addresses, even when the tasks are trivial (that is, presenting the subnet mask for a given network address expressed in the typical IPv4 dotted-quad representation). As a last item in the paper, the authors compare performance with three popular open-source models (Llama3.1, Gemma2, and Mistral) with their default settings. Although those models are almost 20 times smaller than the GPT-3.5 commercial model used, they reached comparable performance levels. Sadly, the paper does not delve deeper into these models, which can be deployed locally and adapted to specific scenarios.
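To illustrate how mechanical the "trivial" conversions in question are, here is a sketch (not from the paper) deriving a dotted-quad subnet mask from a CIDR prefix length with plain POSIX shell arithmetic:

```shell
# Convert a CIDR prefix length (e.g. 20) to its dotted-quad subnet mask.
# Each octet is 255 while 8 or more prefix bits remain; the boundary octet
# is 256 minus 2^(8 - remaining bits); the rest are 0.
prefix_to_mask() {
    p=$1 mask=
    for octet in 1 2 3 4; do
        if [ "$p" -ge 8 ]; then o=255; p=$((p - 8))
        else o=$((256 - (1 << (8 - p)))); p=0
        fi
        mask="$mask$o."
    done
    printf '%s\n' "${mask%.}"
}
prefix_to_mask 20   # prints 255.255.240.0
```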
The paper is easy to read and does not require deep mathematical or AI-related knowledge. It presents a clear comparison along the described axes for the 503 multiple-choice questions presented. This paper can be used as a guide for structuring similar studies over different fields.

4 June 2025

Gunnar Wolf: The subjective value of privacy Assessing individuals' calculus of costs and benefits in the context of state surveillance

This post is an unpublished review for The subjective value of privacy Assessing individuals' calculus of costs and benefits in the context of state surveillance
Internet users, software developers, academics, entrepreneurs: basically everybody is now aware of the importance of considering privacy as a core part of our online experience. User demand and various national or regional laws have made privacy a continuously present subject. And privacy is such an all-encompassing, complex topic that the angles from which it can be studied seem never to end; I recommend computer networking-oriented newcomers to the topic to refer to Brian Kernighan's excellent work [1]. However, how do regular people like ourselves, in our many capacities, feel about privacy? Lukas Antoine presents a series of experiments aiming at better understanding how people throughout the world understand privacy, and when privacy is held as more or less important than security in different aspects. Particularly, privacy is often portrayed as a value set at tension against surveillance, especially state surveillance, in the name of security: conventional wisdom presents the idea of a privacy calculus. That is, it is often assumed that individuals continuously evaluate the costs and benefits of divulging their personal data, sharing data when they expect a positive net outcome, and denying it otherwise. This framework has been accepted for decades, and the author wishes to challenge it. This book is clearly his doctoral thesis in political science, and its contents are as thorough as expected in this kind of product. The author presents three empirical studies based on cross-survey analysis. The first experiment explores the security justifications for surveillance and how they influence its support. The second one searches whether the stance on surveillance can be made dependent on personal convenience or financial cost. The third study explores whether privacy attitude is context-dependent or can be seen as a stable personality trait.
The studies aim to address the shortcomings of published literature in the field, mainly: (a) the lack of comprehensive research on state surveillance, needed for a better understanding of privacy appreciation; (b) while several studies have tackled the subjective measure of privacy, the lack of cross-national studies to explain wide-ranging phenomena; (c) the fact that most studies in this regard are based on population-based surveys, which cannot establish causal relationships; and (d) a seemingly blind acceptance of the privacy calculus mentioned above, with no strong evidence that it accurately measures people's motivations for disclosing or withholding their data. The specific take, including the framing of the tension between privacy and surveillance, has long been studied, as can be seen in Steven Nock's 1993 book [2], but as Sannon's 2022 article shows [3], social and technological realities require our understanding to be continuously kept up to date. The book is full of theoretical references and does a very good job of explaining the path followed by the author. It is, though, a heavy read, and, for people not coming from the social sciences tradition, leads to the occasional feeling of being lost. The conceptual and theoretical frameworks and presented studies are thorough and clear. The author is honest in explaining when the data points at some of his hypotheses being disproven, while others are confirmed. The aim of the book is for people digging deep into this topic. Personally, I have authored several works on different aspects of privacy (such as a book [4] and a magazine issue [5]), but this book did get me thinking on many issues I had not previously considered. Looking for comparable works, I find the chapter organization of Friedewald et al.'s 2017 book [6] to follow a similar thought line.
My only complaint would be that, for a publication from such a highly prestigious publisher, little attention has been paid to editorial aspects: sub-subsection depth is often excessive and unclear. Also, when publishing monographs based on doctoral works, it is customary to no longer refer to the work as a thesis and to soften some of the formal requirements such a work often has, with the aim of producing a gentler and more readable book; this book seems just like the mass production of an (otherwise very interesting and well-made) thesis work. References:

Gunnar Wolf: Humanities and big data in Ibero-America Theory, methodology and practical applications

This post is an unpublished review for Humanities and big data in Ibero-America Theory, methodology and practical applications
Digital humanities is a young though established field. It deals with the different ways in which digital data manipulation techniques can be applied and used to analyze subjects that are identified as belonging to the humanities. Although most often used to analyze different aspects of literature or for social network analysis, it can also be applied to other humanistic disciplines or artistic expressions. Digital humanities employs many tools, but those categorized as big data are among the most frequently employed. This book samples different takes on digital humanities, with the particularity that it focuses on Ibero-American uses. It is worth noting that this book is the second in a series of four volumes, published or set to be published between 2022 and 2026. Being the output of a field survey, I perceive this book to be targeted towards fellow digital humanists: people interested in applying computational methods to further understand and research topics in the humanities. It is not a technical book in the sense Computer Science people would recognize as such, but several of the presented works do benefit from understanding some technical concepts. The 12 articles (plus an introduction) that make up this book are organized in three parts: (1) "Theoretical Framework" presents the ideas and techniques of data science (which make up the tools for handling big data), and explores how data science can contribute to literary analysis, all while noting that many such techniques are usually frowned upon in Latin America as data science "smells neoliberal"; (2) "Methodological Issues" looks at specific issues through the lens of how they can be applied to big data, with specific attention given to works in Spanish; and (3) "Practical Applications" analyzes specific Spanish works and communities based on big data techniques. Several chapters treat a recurring theme: the simultaneous resistance and appropriation of big data by humanists.
For example, at least three of the chapters describe the tensions between humanism ("aesthesis") and cold, number-oriented data analysis ("mathesis"). The analyzed works of Parts 2 and 3 are interesting and relatively easy to follow. Some ideological bias can be gleaned from the book's and series' very name, which refers to the Spanish-speaking regions as "Ibero-America", a term often seen as Eurocentric, in contrast with "Latin America", the term much more widely used throughout the region. I will end with some notes about the specific versions of the book I reviewed. I read both an EPUB version and a print copy. The EPUB did not include links for easy navigation to footnotes; that is, the typographical superscripts are not hyperlinked to the location of the notes, so it is very impractical to try to follow them. The print version (unlike the EPUB) did not have an index; that is, the six pages before the introduction are missing from the print copy I received. For a book such as this one, not having an index hampers the ease of reading and referencing.

Gunnar Wolf: Beyond data poisoning in federated learning

This post is an unpublished review for Beyond data poisoning in federated learning
The current boom of artificial intelligence (AI) is based upon neural networks (NNs). In order for these to be useful, the network has to undergo a machine learning (ML) process: working over a series of inputs and adjusting the inner weights of the connections between neurons so that each of the data samples the network was trained on produces the right set of labels. Federated learning (FL) appeared as a reaction to the data centralization that traditional ML requires: instead of centrally controlling the whole training data, various different actors analyze disjoint subsets of data and provide only the results of this analysis, thus increasing privacy while still analyzing a large dataset. Given that multiple actors are involved in FL, how hard is it for a hostile actor to provide data that will confuse the NN instead of helping it reach better performance? This kind of attack is termed a poisoning attack, and it is the main focus of this paper. The authors set out to research how effective a hyperdimensional data poisoning attack (HDPA) can be at confusing a NN and causing it to misclassify both the items it was trained on and yet-unseen items. Data used for NN training is usually represented as a large set of orthogonal vectors, each describing a different aspect of the item, allowing for very simple vector arithmetic operations; thus, NN training is termed high-dimensional or hyperdimensional. The attack method described by the authors employs cosine similarity: in order to preserve similarity, a target hypervector is reflected over a given dimension, yielding a cosine-similar result that will trick ML models, even those using byzantine-robust defenses. The paper is clear, though not an easy read. It explains the mathematical operations in detail, following several related although different threat models.
The authors present the results of an experimental evaluation of their proposed model, comparing it against several well-known adversarial attacks on visual recognition tasks, over pre-labeled datasets frequently used as training data, such as MNIST, Fashion-MNIST and CIFAR-10. They show that their method is not only more effective as an attack, but also runs within the same time range as the other surveyed attacks.

Adversarial attacks are, all in all, an important way to advance any field of knowledge; by publishing this attack, the authors will surely spark further work to detect and prevent this kind of alteration. It is important for AI implementers to understand the nature of this field and be aware of the risks that this work, as well as others cited in it, highlights: ML will train a computer system to recognize a dataset, warts and all. Efficient as AI is, if noise is allowed into the training data (particularly adversarially generated noise), the trained model may exhibit impaired performance.

Gunnar Wolf: Computational modelling of robot personhood and relationality

This post is an unpublished review for Computational modelling of robot personhood and relationality
If humans and robots were able to roam around the same spaces, mutually recognizing each other for what they are, what would their interactions look like? How can we model such interactions in a way that lets us reason about and understand the implications of a given behavior? This book aims at answering these questions.

The book is split into two very different parts. Chapters 1 through 3 are written mostly from a philosophical angle. The book starts by framing the possibility of sentient androids existing on the same plane as humans, without them trying to pass as us or vice versa. The first chapters look at issues related to personhood, that is, how androids can be treated as valid interaction partners in a society with humans, and how interactions with them can be seen as meaningful. In doing so, several landmarks of the past 40 years in the AI field are reviewed. The author introduces and explains the Significant Concerns that make up a society and give it coherence, as well as Personhood and Relationality, describing how these permeate from a society into each of the individuals that make it up, the relations between them, and the social objects that bring individuals closer together (or drive them farther apart).

The second part of the book is written from a very different angle, and the change in pace took me somewhat by surprise. Each subsequent chapter presents a different facet of the Affinity system, a model that follows some aspects of human behavior over time and in a given space. Chapter 4 introduces the Affinity environment: a 3D simulated environment with simulated physical laws and characteristics, where a number of agents (30-50 is mentioned as usual) interact. Agents have a series of attributes (value memory), can adhere to different programs (narratives), and gain or lose on some vectors (economy). They can sense the world around them with sensors, and can modify the world or signal other agents using effectors.
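The agent model just described can be caricatured in a few lines of code. To be clear, every name below is my own illustration of the concepts (value memory, narratives, economy, sensors, effectors), not the Affinity system's actual interface:

```python
import random

class Agent:
    """A toy agent loosely inspired by the description above."""
    def __init__(self, name, narrative):
        self.name = name
        self.narrative = narrative          # the program the agent adheres to
        self.value_memory = {}              # attributes remembered about others
        self.economy = {"energy": 1.0}      # a vector the agent gains/loses on

    def sense(self, world):
        # Sensors: observe the other agents and remember something about them.
        for other in world:
            if other is not self:
                self.value_memory[other.name] = other.narrative

    def act(self):
        # Effectors: acting on the world costs a little energy.
        self.economy["energy"] -= 0.01
        return f"{self.name} follows the '{self.narrative}' narrative"

# A population in the 30-50 range mentioned by the book.
world = [Agent(f"a{i}", random.choice(["trade", "explore"])) for i in range(30)]
for agent in world:
    agent.sense(world)
print(world[0].act())
```

Running many such steps and looking for patterns in the agents' value memories and economies is, very roughly, the kind of analysis the book's later chapters perform.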
The last two chapters round out the book, as expected: the first presents a set of results from analyzing a given set of value systems, and the second gives readers the author's conclusions. However, I was expecting more: at least a link to download the Affinity system, so readers could continue exploring it, modify some of the aspects it models to follow a set of agents with different stories and narratives, or extend it to yet-unforeseen behaviors; or, failing that, a more complete comparison of results than the evaluation of patterns resulting from a single run. The author is well known and prolific in the field, and I was expecting bigger insights from this book.

Nevertheless, the book is an interesting and fun read, with important insights in both its parts. There is a certain disconnect between their respective rhythms, even though the second part does build on the concepts introduced in the first one. Overall, I enjoyed reading the book, despite expecting more.

Gunnar Wolf: Will be adding yet-to-be-published reviews

Since December 2023, I have been publishing the reviews I write for Computing Reviews as they get published. I will now make a slight change: I will start pushing reviews to my blog as I write them, and will, of course, update them with the final wording and a link to their published location as soon as they appear. I'm doing this because it sometimes takes a very long time for reviews to be approved, and I want to share them with my blog's readers! So, please bear with this a bit: I'll send a (short!) flood of my latest four pending reviews today.

Some books I have recently read

23 May 2025

Gunnar Wolf: No further discussion -- I am staying with a Thinkpad keyboard.

I have been a very happy user of my two SK-8845 keyboards (one at my office, one at home) since I bought them, in 2018 and 2021 respectively. What are they, you ask?

SK-8845 keyboard

The beautiful keyboard every Thinkpad owner knows and loves. And although I no longer use the X230 laptop that was my workhorse for several years, my fingers are spoiled.

So: both shift keys of my home keyboard have been getting flaky, and I am fairly sure it's a failure in the controller, as it does not feel physical. It's time to revisit that seven-year-old post where I found the SK-8845. This time, I decided to try my luck with something different. As an Emacs user, everybody knows we ought to be happy with more and more keys. In fact, I suppose many international people are by now familiar with El Eternauta, true? We Emacs users would be the natural ambassadors to deal with the hand species: Emacs users from outer space! So it kind-of sort-of made sense, when I saw a Toshiba-IBM keyboard being sold for quite cheap (MX$400, just over US$20), to try my luck with it:

A truly POS keyboard

This is quite an odd piece of hardware, built in 2013 according to its label. At first I was unsure whether to buy it because of the weird interface it had, but the vendor replied they would ship a (very long!) USB cable with it, so...

A weird port inside the keyboard

And a matching weird connector

As expected, connecting it to Linux led to a swift, errorless recognition:

Nothing too odd here

Within minutes of receiving the hardware, I had it hooked up and started looking at the events it generated. However, the romance soon started to wane, for several reasons. Anyway, I'm returning it. I found an SK-8845 for sale in China for just MX$1814 (~US$90), and jumped at it. They are getting scarce!
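For readers who want to check how Linux recognized a freshly plugged keyboard, the kernel exposes it in `/proc/bus/input/devices`. Here is a small sketch that extracts the device names from that format; since the real file only exists on a Linux box, it parses a captured sample, and the sample's contents (names, vendor/product IDs) are purely illustrative:

```python
# Extract the Name= lines from /proc/bus/input/devices-style output.
sample = """\
I: Bus=0003 Vendor=04b3 Product=3025 Version=0110
N: Name="IBM USB Keyboard"
H: Handlers=sysrq kbd event3

I: Bus=0003 Vendor=17ef Product=6009 Version=0110
N: Name="ThinkPad USB Keyboard with TrackPoint"
H: Handlers=sysrq kbd mouse1 event4
"""

def device_names(text):
    """Return the quoted device names from N: Name="..." lines."""
    return [line.split('"')[1] for line in text.splitlines()
            if line.startswith("N: Name=")]

for name in device_names(sample):
    print(name)
```

On a real system you would read the file instead of the sample string, or watch the raw events with a tool such as `evtest`.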
Nowadays it's getting more common (and cheaper) to find the newer-style Thinkpad keyboards, but without a trackpad. I don't think I should stockpile keyboards... but no, I'm not doing that. Anyway, I'm sticking with a Thinkpad keyboard, third in a row.
