Search Results: "edd"

18 April 2024

Thomas Koch: Good things come ... state folder

Posted on January 2, 2024
Just a little while ago (10 years) I proposed the addition of a state folder to the XDG basedir specification and expanded the article XDGBaseDirectorySpecification in the Debian wiki. Recently I learned, that version 0.8 (from May 2021) of the spec finally includes a state folder. Granted, I wasn t the first to have this idea (2009), nor the one who actually made it happen. Now, please go ahead and use it! Thank you.

17 April 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.2.1 on CRAN: Micro Fix

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1135 other packages on CRAN, downloaded 33.7 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 579 times according to Google Scholar. Yesterday s release accommodates reticulate by suspending a single test that now croaks creating a reverse-dependency issue for that package. No other changes were made. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.2.1 (2024-04-15)
  • One-char bug fix release commenting out one test that upsets reticulate when accessing a scipy sparse matrix

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

11 April 2024

Reproducible Builds: Reproducible Builds in March 2024

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website. Table of contents:
  1. Arch Linux minimal container userland now 100% reproducible
  2. Validating Debian s build infrastructure after the XZ backdoor
  3. Making Fedora Linux (more) reproducible
  4. Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management
  5. Software and source code identification with GNU Guix and reproducible builds
  6. Two new Rust-based tools for post-processing determinism
  7. Distribution work
  8. Mailing list highlights
  9. Website updates
  10. Delta chat clients now reproducible
  11. diffoscope updates
  12. Upstream patches
  13. Reproducibility testing framework

Arch Linux minimal container userland now 100% reproducible In remarkable news, Reproducible builds developer kpcyrd reported that that the Arch Linux minimal container userland is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a real world , widely-used Linux distribution being reproducible. Their post, which kpcyrd suffixed with the question now what? , continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup for an Arch Linux update earlier in the month, generated a significant number of replies.

Validating Debian s build infrastructure after the XZ backdoor From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian s build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability. Vagrant reports (with some caveats):
So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.
That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.

Making Fedora Linux (more) reproducible In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible. Documented in more detail on Fedora s website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what s coming next. (YouTube video)

Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:
Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. PDF
Julien s paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can be leveraged to further improve the safety of the software supply chain , etc.

Software and source code identification with GNU Guix and reproducible builds In a long line of commendably detailed blog posts, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, Ludovic Court s, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: What does it take to identify software ? How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it? Later in the month, Ludovic Court s wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic s post touches on GNU Guix s aim to support time travel , the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.

Two new Rust-based tools for post-processing determinism Zbigniew J drzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project s own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora In addition, Yossi Kreinin published a blog post titled refix: fast, debuggable, reproducible builds that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc.. Yossi post details the motivation and techniques behind the (fast) performance of the tool.

Distribution work In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the bug severity for issues filed with a level of normal to a new level of wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month adding to ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [ ] and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [ ] and updating the random_uuid_in_notebooks_generated_by_nbsphinx to reference a relevant discussion thread [ ]. In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition. Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.

Mailing list highlights Elsewhere on our mailing list this month:

Website updates There were made a number of improvements to our website this month, including:
  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CIF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added an substantial new section to the buy in page documenting the role of Software Bill of Materials (SBOMs) and ephemeral development environments. [ ][ ]
  • Bernhard M. Wiedemann added a new commandments page to the documentation [ ][ ] and fixed some incorrect YAML elsewhere on the site [ ].
  • Chris Lamb add three recent academic papers to the publications page of the website. [ ]
  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [ ][ ][ ]
  • Roland Clobus updated the stable outputs page, dropping version numbers from Python documentation pages [ ] and noting that Python s set data structure is also affected by the PYTHONHASHSEED functionality. [ ]

Delta chat clients now reproducible Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying Delta chat application is now reproducible.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 259, 260 and 261 to Debian and made the following additional changes:
  • New features:
    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. [ ]
  • Bug fixes:
    • Don t identify Redis database dumps as GNU R database files based simply on their filename. [ ]
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. [ ]
    • Don t crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available and not lz4 when testing 7z. [ ]
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. [ ]
  • Testsuite improvements:
    • Fix .epub tests after supporting the new zipdetails tool. [ ]
    • Don t use parenthesis within test skipping messages, as PyTest adds its own parenthesis. [ ]
    • Factor out Python version checking in test_zip.py. [ ]
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. [ ]
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). [ ]
In addition, Fay Stegerman updated diffoscope s monkey patch for supporting the unusual Mozilla ZIP file format after Python s zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362) Chris Lamb also updated the trydiffoscope command line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [ ], taking a moment to also refresh the packaging to the latest Debian standards [ ]. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. [ ]

Upstream patches This month, we wrote a large number of patches, including: Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd. Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:
I don t have the hardware to test this firmware, but the build produces the same hashes for the firmware so it s safe to say that the firmware should keep working.

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, an enormous number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Sleep less after a so-called 404 package state has occurred. [ ]
    • Schedule package builds more often. [ ][ ]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. [ ]
    • Create and update unstable and experimental base systems on armhf again. [ ][ ]
    • Don t reschedule so many depwait packages due to the current size of the i386 architecture queue. [ ]
    • Redefine our scheduling thresholds and amounts. [ ]
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. [ ]
    • Only create the stats_buildinfo.png graph once per day. [ ][ ]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. [ ]
    • Document how to use systemctl with new systemd-based services. [ ]
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. [ ]
    • Use the deb.debian.org CDN everywhere. [ ]
    • Remove the rsyslog logging facility on bookworm systems. [ ]
    • Add zst to the list of packages which are false-positive diskspace issues. [ ]
    • Detect failures to bootstrap Debian base systems. [ ]
  • Arch Linux-related changes:
    • Temporarily disable builds because the pacman package manager is broken. [ ][ ]
    • Split reproducible_html_live_status and split the scheduling timing . [ ][ ][ ]
    • Improve handling when database is locked. [ ][ ]
  • Misc changes:
    • Show failed services that require manual cleanup. [ ][ ]
    • Integrate two new Infomaniak nodes. [ ][ ][ ][ ]
    • Improve IRC notifications for artifacts. [ ]
    • Run diffoscope in different systemd slices. [ ]
    • Run the node health check more often, as it can now repair some issues. [ ][ ]
    • Also include the string Bot in the userAgent for Git. (Re: #929013). [ ]
    • Document increased tmpfs size on our OUSL nodes. [ ]
    • Disable memory account for the reproducible_build service. [ ][ ]
    • Allow 10 times as many open files for the Jenkins service. [ ]
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. [ ]
Mattia Rizzolo also made the following changes:
  • Debian-related changes:
    • Define a systemd slice to group all relevant services. [ ][ ]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. [ ]
    • Add stats on how many packages have been built today so far. [ ]
    • Instruct systemd-run to handle diffoscope s exit codes specially. [ ]
    • Prefer the pgrep tool over grepping the output of ps. [ ]
    • Re-enable a couple of i386 and armhf architecture builders. [ ][ ]
    • Fix some stylistic issues flagged by the Python flake8 tool. [ ]
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. [ ]
    • Start a few more i386 & armhf workers. [ ][ ][ ]
    • Temporarly skip pbuilder updates in the unstable distribution, but only on the armhf architecture. [ ]
  • Other changes:
    • Perform some large-scale refactoring on how the systemd service operates. [ ][ ]
    • Move the list of workers into a separate file so it s accessible to a number of scripts. [ ]
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. [ ]
    • Also fix nph-logwatch after the worker changes. [ ]
    • Do not install the stunnel tool anymore, it shouldn t be needed by anything anymore. [ ]
    • Move temporary directories related to Arch Linux into a single directory for clarity. [ ]
    • Update the arm64 architecture host keys. [ ]
    • Use a common Postfix configuration. [ ]
The following changes were also made by:
  • Jan-Benedict Glaw:
    • Initial work to clean up a messy NetBSD-related script. [ ][ ]
  • Roland Clobus:
    • Show the installer log if the installer fails to build. [ ]
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. [ ]
    • Update the schedule of Debian live image builds. [ ]
  • Vagrant Cascadian:
    • Maintenance on the virt* nodes is completed so bring them back online. [ ]
    • Use the fully qualified domain name in configuration. [ ]
Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 April 2024

Bastian Blank: Python dataclasses for Deb822 format

Python includes some helping support for classes that are designed to just hold some data and not much more: Data Classes. It uses plain Python type definitions to specify what you can have and some further information for every field. This will then generate you some useful methods, like __init__ and __repr__, but on request also more. But given that those type definitions are available to other code, a lot more can be done. There exists several separate packages to work on data classes. For example you can have data validation from JSON with dacite. But Debian likes a pretty strange format usually called Deb822, which is in fact derived from the RFC 822 format of e-mail messages. Those files includes single messages with a well known format. So I'd like to introduce some Deb822 format support for Python Data Classes. For now the code resides in the Debian Cloud tool. Usage Setup It uses the standard data classes support and several helper functions. Also you need to enable support for postponed evaluation of annotations.
from __future__ import annotations
from dataclasses import dataclass
from dataclasses_deb822 import read_deb822, field_deb822
Class definition start Data classes are just normal classes, just with a decorator.
@dataclass
class Package:
Field definitions You need to specify the exact key to be used for this field.
    package: str = field_deb822('Package')
    version: str = field_deb822('Version')
    arch: str = field_deb822('Architecture')
Default values are also supported.
    multi_arch: Optional[str] = field_deb822(
        'Multi-Arch',
        default=None,
    )
Reading files
for p in read_deb822(Package, sys.stdin, ignore_unknown=True):
    print(p)
Full example
from __future__ import annotations
from dataclasses import dataclass
from debian_cloud_images.utils.dataclasses_deb822 import read_deb822, field_deb822
from typing import Optional
import sys
@dataclass
class Package:
    package: str = field_deb822('Package')
    version: str = field_deb822('Version')
    arch: str = field_deb822('Architecture')
    multi_arch: Optional[str] = field_deb822(
        'Multi-Arch',
        default=None,
    )
for p in read_deb822(Package, sys.stdin, ignore_unknown=True):
    print(p)
Known limitations

5 April 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.2.0 on CRAN: Upstream Fix

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1136 other packages on CRAN, downloaded 33.5 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 578 times according to Google Scholar. This release brings a new upstream bugfix release Armadillo 12.8.2 prepared by Conrad two days ago. It took the usual day to noodle over 1100+ reverse dependencies and ensure two failures were independent of the upgrade (i.e., no change to worse in CRAN parlance). It took CRAN another because we hit a random network outage for (spurious) NOTE on a remote URL, and were then caught in the shrapnel from another large package ecosystem update spuriously pointing some build failures that were due to a missing rebuild to us. All good, as human intervention comes to the rescue. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.2.0 (2024-04-02)
  • Upgraded to Armadillo release 12.8.2 (Cortisol Injector)
    • Workaround for FFTW3 header clash
    • Workaround in testing framework for issue under macOS
    • Minor cleanups to reduce code bloat
    • Improved documentation

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Bits from Debian: apt install dpl-candidate: Andreas Tille

The Debian Project Developers will shortly vote for a new Debian Project Leader known as the DPL. The Project Leader is the official representative of The Debian Project tasked with managing the overall project, its vision, direction, and finances. The DPL is also responsible for the selection of Delegates, defining areas of responsibility within the project, the coordination of Developers, and making decisions required for the project. Our outgoing and present DPL Jonathan Carter served 4 terms, from 2020 through 2024. Jonathan shared his last Bits from the DPL post to Debian recently and his hopes for the future of Debian. Recently, we sat with the two present candidates for the DPL position asking questions to find out who they really are in a series of interviews about their platforms, visions for Debian, lives, and even their favorite text editors. The interviews were conducted by disaster2life (Yashraj Moghe) and made available from video and audio transcriptions: Voting for the position starts on April 6, 2024. Editors' note: This is our official return to Debian interviews, readers should stay tuned for more upcoming interviews with Developers and other important figures in Debian as part of our "Meet your Debian Developer" series. We used the following tools and services: Turboscribe.ai for the transcription from the audio and video files, IRC: Oftc.net for communication, Jitsi meet for interviews, and Open Broadcaster Software (OBS) for editing and video. While we encountered many technical difficulties in the return to this process, we are still able and proud to present the transcripts of the interviews edited only in a few areas for readability. 2024 Debian Project Leader Candidate: Andrea Tille Andreas' Interview Who are you? Tell us a little about yourself. [Andreas]:
How am I? Well, I'm, as I wrote in my platform, I'm a proud grandfather doing a lot of free software stuff, doing a lot of sports, have some goals in mind which I like to do and hopefully for the best of Debian.
And How are you today? [Andreas]:
How I'm doing today? Well, actually I have some headaches but it's fine for the interview. So, usually I feel very good. Spring was coming here and today it's raining and I plan to do a bicycle tour tomorrow and hope that I do not get really sick but yeah, for the interview it's fine.
What do you do in Debian? Could you mention your story here? [Andreas]:
Yeah, well, I started with Debian kind of an accident because I wanted to have some package salvaged which is called WordNet. It's a monolingual dictionary and I did not really plan to do more than maybe 10 packages or so. I had some kind of training with xTeddy which is totally unimportant, a cute teddy you can put on your desktop. So, and then well, more or less I thought how can I make Debian attractive for my employer which is a medical institute and so on. It could make sense to package bioinformatics and medicine software and it somehow evolved in a direction I did neither expect it nor wanted to do, that I'm currently the most busy uploader in Debian, created several teams around it. DebianMate is very well known from me. I created the Blends team to create teams and techniques around what we are doing which was Debian TIS, Debian Edu, Debian Science and so on and I also created the packaging team for R, for the statistics package R which is technically based and not topic based. All these blends are covering a certain topic and R is just needed by lots of these blends. So, yeah, and to cope with all this I have written a script which is routing an update to manage all these uploads more or less automatically. So, I think I had one day where I uploaded 21 new packages but it's just automatically generated, right? So, it's on one day more than I ever planned to do.
What is the first thing you think of when you think of Debian? Editors' note: The question was misunderstood as the worst thing you think of when you think of Debian [Andreas]:
The worst thing I think about Debian, it's complicated. I think today on Debian board I was asked about the technical progress I want to make and in my opinion we need to standardize things inside Debian. For instance, bringing all the packages to salsa, follow some common standards, some common workflow which is extremely helpful. As I said, if I'm that productive with my own packages we can adopt this in general, at least in most cases I think. I made a lot of good experience by the support of well-formed teams. Well-formed teams are those teams where people support each other, help each other. For instance, how to say, I'm a physicist by profession so I'm not an IT expert. I can tell apart what works and what not but I'm not an expert in those packages. I do and the amount of packages is so high that I do not even understand all the techniques they are covering like Go, Rust and something like this. And I also don't speak Java and I had a problem once in the middle of the night and I've sent the email to the list and was a Java problem and I woke up in the morning and it was solved. This is what I call a team. I don't call a team some common repository that is used by random people for different packages also but it's working together, don't hesitate to solve other people's problems and permit people to get active. This is what I call a team and this is also something I observed in, it's hard to give a percentage, in a lot of other teams but we have other people who do not even understand the concept of the team. Why is working together make some advantage and this is also a tough thing. I [would] like to tackle in my term if I get elected to form solid teams using the common workflow. This is one thing. The other thing is that we have a lot of good people in our infrastructure like FTP masters, DSA and so on. I have the feeling they have a lot of work and are working more or less on their limits, and I like to talk to them [to ask] what kind of change we could do to move that limits or move their personal health to the better side.
The DPL term lasts for a year, What would you do during that you couldn't do now? [Andreas]:
Yeah, well this is basically what I said are my main issues. I need to admit I have no really clear imagination what kind of tasks will come to me as a DPL because all these financial issues and law issues possible and issues [that] people who are not really friendly to Debian might create. I'm afraid these things might occupy a lot of time and I can't say much about this because I simply don't know.
What are three key terms about you and your candidacy? [Andreas]:
As I said, I like to work on standards, I d like to make Debian try [to get it right so] that people don't get overworked, this third key point is be inviting to newcomers, to everybody who wants to come. Yeah, I also mentioned in my term this diversity issue, geographical and from gender point of view. This may be the three points I consider most important.
Preferred text editor? [Andreas]:
Yeah, my preferred one? Ah, well, I have no preferred text editor. I'm using the Midnight Commander very frequently which has an internal editor which is convenient for small text. For other things, I usually use VI but I also use Emacs from time to time. So, no, I have not preferred text editor. Whatever works nicely for me.
What is the importance of the community in the Debian Project? How would like to see it evolving over the next few years? [Andreas]:
Yeah, I think the community is extremely important. So, I was on a lot of DebConfs. I think it's not really 20 but 17 or 18 DebCons and I really enjoyed these events every year because I met so many friends and met so many interesting people that it's really enriching my life and those who I never met in person but have read interesting things and yeah, Debian community makes really a part of my life.
And how do you think it should evolve specifically? [Andreas]:
Yeah, for instance, last year in Kochi, it became even clearer to me that the geographical diversity is a really strong point. Just discussing with some women from India who is afraid about not coming next year to Busan because there's a problem with Shanghai and so on. I'm not really sure how we can solve this but I think this is a problem at least I wish to tackle and yeah, this is an interesting point, the geographical diversity and I'm running the so-called mentoring of the month. This is a small project to attract newcomers for the Debian Med team which has the focus on medical packages and I learned that we had always men applying for this and so I said, okay, I dropped the constraint of medical packages. Any topic is fine, I teach you packaging but it must be someone who does not consider himself a man. I got only two applicants, no, actually, I got one applicant and one response which was kind of strange if I'm hunting for women or so. I did not understand but I got one response and interestingly, it was for me one of the least expected counters. It was from Iran and I met a very nice woman, very open, very skilled and gifted and did a good job or have even lose contact today and maybe we need more actively approach groups that are underrepresented. I don't know if what's a good means which I did but at least I tried and so I try to think about these kind of things.
What part of Debian has made you smile? What part of the project has kept you going all through the years? [Andreas]:
Well, the card game which is called Mao on the DebConf made me smile all the time. I admit I joined only two or three times even if I really love this kind of games but I was occupied by other stuff so this made me really smile. I also think the first online DebConf in 2020 made me smile because we had this kind of short video sequences and I tried to make a funny video sequence about every DebConf I attended before. This is really funny moments but yeah, it's not only smile but yeah. One thing maybe it's totally unconnected to Debian but I learned personally something in Debian that we have a do-ocracy and you can do things which you think that are right if not going in between someone else, right? So respect everybody else but otherwise you can do so. And in 2020 I also started to take trees which are growing widely in my garden and plant them into the woods because in our woods a lot of trees are dying and so I just do something because I can. I have the resource to do something, take the small tree and bring it into the woods because it does not harm anybody. I asked the forester if it is okay, yes, yes, okay. So everybody can do so but I think the idea to do something like this came also because of the free software idea. You have the resources, you have the computer, you can do something and you do something productive, right? And when thinking about this I think it was also my Debian work. Meanwhile I have planted more than 3,000 trees so it's not a small number but yeah, I enjoy this.
What part of Debian would you have some criticisms for? [Andreas]:
Yeah, it's basically the same as I said before. We need more standards to work together. I do not want to repeat this but this is what I think, yeah.
What field in Free Software generally do you think requires the most work to be put into it? What do you think is Debian's part in the field? [Andreas]:
It's also in general, the thing is the fact that I'm maintaining packages which are usually as modern software is maintained in Git, which is fine but we have some software which is at Sourceport, we have software laying around somewhere, we have software where Debian somehow became Upstream because nobody is caring anymore and free software is very different in several things, ways and well, I in principle like freedom of choice which is the basic of all our work. Sometimes this freedom goes in the way of productivity because everybody is free to re-implement. You asked me for the most favorite editor. In principle one really good working editor would be great to have and would work and we have maybe 500 in Debian or so, I don't know. I could imagine if people would concentrate and say five instead of 500 editors, we could get more productive, right? But I know this will not happen, right? But I think this is one thing which goes in the way of making things smooth and productive and we could have more manpower to replace one person who's [having] children, doing some other stuff and can't continue working on something and maybe this is a problem I will not solve, definitely not, but which I see.
What do you think is Debian's part in the field? [Andreas]:
Yeah, well, okay, we can bring together different Upstreams, so we are building some packages and have some general overview about similar things and can say, oh, you are doing this and some other person is doing more or less the same, do you want to join each other or so, but this is kind of a channel we have to our Upstreams which is probably not very successful. It starts with code copies of some libraries which are changed a little bit, which is fine license-wise, but not so helpful for different things and so I've tried to convince those Upstreams to forward their patches to the original one, but for this and I think we could do some kind of, yeah, [find] someone who brings Upstream together or to make them stop their forking stuff, but it costs a lot of energy and we probably don't have this and it's also not realistic that we can really help with this problem.
Do you have any questions for me? [Andreas]:
I enjoyed the interview, I enjoyed seeing you again after half a year or so. Yeah, actually I've seen you in the eating room or cheese and wine party or so, I do not remember we had to really talk together, but yeah, people around, yeah, for sure. Yeah.

2 April 2024

Dirk Eddelbuettel: ulid 0.3.1 on CRAN: New Maintainer, Some Polish

Happy to share that ulid is now (back) on CRAN. It provides universally unique identifiers that are lexicographically sortable, which improves over the more well-known uuid generators. ulid is a neat little package put together by Bob Rudis a few years ago. It had recently drifted off CRAN so I offered to brush it up and re-submit it. And as tooted earlier today, it took just over an hour to finish that (after the lead up work I had done, including prior email with CRAN in the loop, the repo transfer from Bob s to my ulid repo plus of course a wee bit of actual maintenance; see below for more). The NEWS entry follows.

Changes in version 0.3.1 (2024-04-02)
  • New Maintainer
  • Deleted several repository files no longer used or needed
  • Added .editorconfig, ChangeLog and cleanup
  • Converted NEWS.md to NEWS.Rd
  • Simplified R/ directory to one source file
  • Simplified src/ removing redundant Makevars
  • Added ulid() alias
  • Updated / edited roxygen and README.md documention
  • Removed vignette which was identical to README.md
  • Switched continuous integration to GitHub Actions
  • Placed upstream (header-only) library into src/ulid/
  • Renamed single interface file to src/wrapper

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

23 March 2024

Dirk Eddelbuettel: littler 0.3.20 on CRAN: Moar Features!

max-heap image The twentyfirst release of littler as a CRAN package landed on CRAN just now, following in the now eighteen year history (!!) as a package started by Jeff in 2006, and joined by me a few weeks later. littler is the first command-line interface for R as it predates Rscript. It allows for piping as well for shebang scripting via #!, uses command-line arguments more consistently and still starts faster. It also always loaded the methods package which Rscript only began to do in recent years. littler lives on Linux and Unix, has its difficulties on macOS due to yet-another-braindeadedness there (who ever thought case-insensitive filesystems as a default were a good idea?) and simply does not exist on Windows (yet the build system could be extended see RInside for an existence proof, and volunteers are welcome!). See the FAQ vignette on how to add it to your PATH. A few examples are highlighted at the Github repo:, as well as in the examples vignette. This release contains another fair number of small changes and improvements to some of the scripts I use daily to build or test packages, adds a new front-end ciw.r for the recently-released ciw package offering a CRAN Incoming Watcher , a new helper installDeps2.r (extending installDeps.r), a new doi-to-bib converter, allows a different temporary directory setup I find helpful, deals with one corner deployment use, and more. The full change description follows.

Changes in littler version 0.3.20 (2024-03-23)
  • Changes in examples scripts
    • New (dependency-free) helper installDeps2.r to install dependencies
    • Scripts rcc.r, tt.r, tttf.r, tttlr.r use env argument -S to set -t to r
    • tt.r can now fill in inst/tinytest if it is present
    • New script ciw.r wrapping new package ciw
    • tttf.t can now use devtools and its loadall
    • New script doi2bib.r to call the DOI converter REST service (following a skeet by Richard McElreath)
  • Changes in package
    • The CI setup uses checkout@v4 and the r-ci-setup action
    • The Suggests: is a little tighter as we do not list all packages optionally used in the the examples (as R does not check for it either)
    • The package load messag can account for the rare build of R under different architecture (Berwin Turlach in #117 closing #116)
    • In non-vanilla mode, the temporary directory initialization in re-run allowing for a non-standard temp dir via config settings

My CRANberries service provides a comparison to the previous release. Full details for the littler release are provided as usual at the ChangeLog page, and also on the package docs website. The code is available via the GitHub repo, from tarballs and now of course also from its CRAN page and via install.packages("littler"). Binary packages are available directly in Debian as well as (in a day or two) Ubuntu binaries at CRAN thanks to the tireless Michael Rutter. Comments and suggestions are welcome at the GitHub repo. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

20 March 2024

Dirk Eddelbuettel: ciw 0.0.2 on CRAN: Updates

A first revision of the still only one-week old (at CRAN) package ciw has been released to CRAN! It provides is a single (efficient) function incoming() (now along with an alias ciw()) which summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with (still draft) wrapper ciw.r that will be part of the next littler release. For example, when I do this right now as I type this, I see (typically less than one second later)
edd@rob:~$ ciw.r 
    Folder                     Name                Time   Size         Age
    <char>                   <char>              <POSc> <char>  <difftime>
1: pretest instantiate_0.2.2.tar.gz 2024-03-20 13:29:00    17K  0.07 hours
2: recheck   tinytable_0.2.0.tar.gz 2024-03-20 12:50:00   565K  0.72 hours
3: pending      Matrix_1.7-0.tar.gz 2024-03-20 12:05:00   2.3M  1.47 hours
4: recheck      survey_4.4-2.tar.gz 2024-03-20 02:02:00   2.2M 11.52 hours
5: waiting   equateIRT_2.4.0.tar.gz 2024-03-19 17:00:00   895K 20.55 hours
6: pending   ravetools_0.1.5.tar.gz 2024-03-19 12:06:00   1.0M 25.45 hours
7: waiting     glmmTMB_1.1.9.tar.gz 2024-03-18 16:04:00   4.2M 45.48 hours
edd@rob:~$ 
See ciw.r --help or ciw.r --usage for more. Alternatively, in your R session, you can call ciw::incoming() (or now ciw::ciw()) for the same result (and/or load the package first). This release adds some packaging touches, brings the new alias ciw() as well as a state variable with all (known) folder names and some internal improvements for dealing with error conditions. The NEWS entry follows.

Changes in version 0.0.2 (2024-03-20)
  • The package README and DESCRIPTION have been expanded
  • An alias ciw can now be used for incoming
  • Network error handling is now more robist
  • A state variable known_folders lists all CRAN folders below incoming

Courtesy of my CRANberries, there is also a diffstat report for this release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

14 March 2024

Dirk Eddelbuettel: ciw 0.0.1 on CRAN: New Package!

Happy to share that ciw is now on CRAN! I had tooted a little bit about it, e.g., here. What it provides is a single (efficient) function incoming() which summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with (still draft) wrapper ciw.r that will be part of the next littler release. For example, when I do this right now as I type this, I see
edd@rob:~$ ciw.r
    Folder                   Name                Time   Size          Age
    <char>                 <char>              <POSc> <char>   <difftime>
1: waiting   maximin_1.0-5.tar.gz 2024-03-13 22:22:00    20K   2.48 hours
2: inspect    GofCens_0.97.tar.gz 2024-03-13 21:12:00    29K   3.65 hours
3: inspect verbalisr_0.5.2.tar.gz 2024-03-13 20:09:00    79K   4.70 hours
4: waiting    rnames_1.0.1.tar.gz 2024-03-12 15:04:00   2.7K  33.78 hours
5: waiting  PCMBase_1.2.14.tar.gz 2024-03-10 12:32:00   406K  84.32 hours
6: pending        MPCR_1.1.tar.gz 2024-02-22 11:07:00   903K 493.73 hours
edd@rob:~$ 
which is rather compact as CRAN kept busy! This call runs in about (or just over) one second, which includes launching r. Good enough for me. From a well-connected EC2 instance it is about 800ms on the command-line. When I do I from here inside an R session it is maybe 700ms. And doing it over in Europe is faster still. (I am using ping=FALSE for these to omit the default sanity check of can I haz networking? to speed things up. The check adds another 200ms or so.) The function (and the wrapper) offer a ton of options too this is ridiculously easy to do thanks to the docopt package:
edd@rob:~$ ciw.r -x
Usage: ciw.r [-h] [-x] [-a] [-m] [-i] [-t] [-p] [-w] [-r] [-s] [-n] [-u] [-l rows] [-z] [ARG...]

-m --mega           use 'mega' mode of all folders (see --usage)
-i --inspect        visit 'inspect' folder
-t --pretest        visit 'pretest' folder
-p --pending        visit 'pending' folder
-w --waiting        visit 'waiting' folder
-r --recheck        visit 'waiting' folder
-a --archive        visit 'archive' folder
-n --newbies        visit 'newbies' folder
-u --publish        visit 'publish' folder
-s --skipsort       skip sorting of aggregate results by age
-l --lines rows     print top 'rows' of the result object [default: 50]
-z --ping           run the connectivity check first
-h --help           show this help text
-x --usage          show help and short example usage 

where ARG... can be one or more file name, or directories or package names.

Examples:
  ciw.r -ip                            # run in 'inspect' and 'pending' mode
  ciw.r -a                             # run with mode 'auto' resolved in incoming()
  ciw.r                                # run with defaults, same as '-itpwr'

When no argument is given, 'auto' is selected which corresponds to 'inspect', 'waiting',
'pending', 'pretest', and 'recheck'. Selecting '-m' or '--mega' are select as default.

Folder selecting arguments are cumulative; but 'mega' is a single selections of all folders
(i.e. 'inspect', 'waiting', 'pending', 'pretest', 'recheck', 'archive', 'newbies', 'publish').

ciw.r is part of littler which brings 'r' to the command-line.
See https://dirk.eddelbuettel.com/code/littler.html for more information.
edd@rob:~$ 
The README at the git repo and the CRAN page offer a screenshot movie showing some of the options in action. I have been using the little tools quite a bit over the last two or three weeks since I first put it together and find it quite handy. With that again a big Thank You! of appcreciation for all that CRAN does which this week included letting this past the newbies desk in under 24 hours. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

11 March 2024

Dirk Eddelbuettel: digest 0.6.35 on CRAN: New xxhash code

Release 0.6.35 of the digest package arrived at CRAN today and has also been uploaded to Debian already. digest creates hash digests of arbitrary R objects. It can use a number different hashing algorithms (md5, sha-1, sha-256, sha-512, crc32, xxhash32, xxhash64, murmur32, spookyhash, blake3,crc32c and now also xxh3_64 and xxh3_128), and enables easy comparison of (potentially large and nested) R language objects as it relies on the native serialization in R. It is a mature and widely-used package (with 65.8 million downloads just on the partial cloud mirrors of CRAN which keep logs) as many tasks may involve caching of objects for which it provides convenient general-purpose hash key generation to quickly identify the various objects. This release updates the included xxHash version to the current verion 0.8.2 updating the existing xxhash32 and xxhash64 hash functions and also adding the newer xxh3_64 and xxh3_128 ones. We have a project at work using xxh3_128 from Python which made me realize having it from R would be nice too, and given the existing infrastructure in the package actually doing so was fairly quick and straightforward. My CRANberries provides a summary of changes to the previous version. For questions or comments use the issue tracker off the GitHub repo. For documentation (including the changelog) see the documentation site. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

9 March 2024

Reproducible Builds: Reproducible Builds in February 2024

Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.

Reproducible Builds at FOSDEM 2024 Core Reproducible Builds developer Holger Levsen presented at the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. However, that wasn t the only talk related to Reproducible Builds. However, please see our comprehensive FOSDEM 2024 news post for the full details and links.

Maintainer Perspectives on Open Source Software Security Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation sports an infographic which mentions that 56% of [polled] projects support reproducible builds .

Mailing list highlights From our mailing list this month:

Distribution work In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed this month adding to Debian s knowledge about identified issues. A number of issue types were updated as well. [ ][ ][ ][ ] In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in-depth to a query about a previous report
Fedora developer Zbigniew J drzejewski-Szmek announced a work-in-progress script called fedora-repro-build that attempts to reproduce an existing package within a koji build environment. Although the projects README file lists a number of fields will always or almost always vary and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files leftover by the Sphinx documentation generator which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.
Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work elsewhere in openSUSE.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 256, 257 and 258 to Debian and made the following additional changes:
  • Use a deterministic name instead of trusting gpg s use-embedded-filenames. Many thanks to Daniel Kahn Gillmor dkg@debian.org for reporting this issue and providing feedback. [ ][ ]
  • Don t error-out with a traceback if we encounter struct.unpack-related errors when parsing Python .pyc files. (#1064973). [ ]
  • Don t try and compare rdb_expected_diff on non-GNU systems as %p formatting can vary, especially with respect to MacOS. [ ]
  • Fix compatibility with pytest 8.0. [ ]
  • Temporarily fix support for Python 3.11.8. [ ]
  • Use the 7zip package (over p7zip-full) after a Debian package transition. (#1063559). [ ]
  • Bump the minimum Black source code reformatter requirement to 24.1.1+. [ ]
  • Expand an older changelog entry with a CVE reference. [ ]
  • Make test_zip black clean. [ ]
In addition, James Addison contributed a patch to parse the headers from the diff(1) correctly [ ][ ] thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to version 255, 256, and 258, and updated trydiffoscope to 67.0.6.

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:
  • Create a (working) proof of concept for enabling a specific number of CPUs. [ ][ ]
  • Consistently use 398 days for time variation rather than choosing randomly and update README.rst to match. [ ][ ]
  • Support a new --vary=build_path.path option. [ ][ ][ ][ ]

Website updates There were made a number of improvements to our website this month, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken. [ ][ ]
    • Use the 64-bit amd64 kernel on all i386 nodes; no more 686 PAE kernels. [ ]
    • Add an Erlang package set. [ ]
  • Other changes:
    • Grant Jan-Benedict Glaw shell access to the Jenkins node. [ ]
    • Enable debugging for NetBSD reproducibility testing. [ ]
    • Use /usr/bin/du --apparent-size in the Jenkins shell monitor. [ ]
    • Revert reproducible nodes: mark osuosl2 as down . [ ]
    • Thanks again to Codethink, for they have doubled the RAM on our arm64 nodes. [ ]
    • Only set /proc/$pid/oom_score_adj to -1000 if it has not already been done. [ ]
    • Add the opemwrt-target-tegra and jtx task to the list of zombie jobs. [ ][ ]
Vagrant Cascadian also made the following changes:
  • Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm. [ ][ ][ ]
  • Add two new armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [ ][ ] [ ][ ]
In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the tegra target with mpc85xx [ ], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out of space issues on a tmpfs-backed /tmp [ ] and Zheng Junjie added a link to the GNU Guix tests [ ]. Lastly, node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

7 March 2024

Dirk Eddelbuettel: prrd 0.0.6 at CRAN: Several Improvements

Thrilled to share that a new version of prrd arrived at CRAN yesterday in a first update in two and a half years. prrd facilitates the parallel running [of] reverse dependency [checks] when preparing R packages. It is used extensively for releases I make of Rcpp, RcppArmadillo, RcppEigen, BH, and others. prrd screenshot image The key idea of prrd is simple, and described in some more detail on its webpage and its GitHub repo. Reverse dependency checks are an important part of package development that is easily done in a (serial) loop. But these checks are also generally embarassingly parallel as there is no or little interdependency between them (besides maybe shared build depedencies). See the (dated) screenshot (running six parallel workers, arranged in a split byobu session). This release, the first since 2021, brings a number of enhancments. In particular, the summary function is now improved in several ways. Josh also put in a nice PR that generalizes some setup defaults and values. The release is summarised in the NEWS entry:

Changes in prrd version 0.0.6 (2024-03-06)
  • The summary function has received several enhancements:
    • Extended summary is only running when failures are seen.
    • The summariseQueue function now displays an anticipated completion time and remaining duration.
    • The use of optional package foghorn has been refined, and refactored, when running summaries.
  • The dequeueJobs.r scripts can receive a date argument, the date can be parse via anydate if anytime ins present.
  • The enqueeJobs.r now considers skipped package when running 'addfailed' while ensuring selecting packages are still on CRAN.
  • The CI setup has been updated (twice),
  • Enqueing and dequing functions and scripts now support relative directories, updated documentation (#18 by Joshua Ulrich).

Courtesy of my CRANberries, there is also a diffstat report for this release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

4 March 2024

Dirk Eddelbuettel: tinythemes 0.0.2 at CRAN: Maintenance

A first maintenance of the still fairly new package tinythemes arrived on CRAN today. tinythemes provides the theme_ipsum_rc() function from hrbrthemes by Bob Rudis in a zero (added) dependency way. A simple example is (also available as a demo inside the package) contrasts the default style (on left) with the one added by this package (on the right): This version mostly just updates to the newest releases of ggplot2 as one must, and takes advantage of Bob s update to hrbrthemes yesterday. The full set of changes since the initial CRAN release follows.

Changes in spdl version 0.0.2 (2024-03-04)
  • Added continuous integrations action based on r2u
  • Added demo/ directory and a READNE.md
  • Minor edits to help page content
  • Synchronised with ggplot2 3.5.0 via hrbrthemes

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the repo where comments and suggestions are welcome. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

3 March 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.1.0 on CRAN: Upstream Fix, Interface Polish

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1130 other packages on CRAN, downloaded 32.8 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 578 times according to Google Scholar. This release brings a new upstream bugfix release Armadillo 12.8.1 prepared by Conrad yesterday. It was delayed for a few hours as CRAN noticed an error in one package which we all concluded was spurious as it could be reproduced outside of the one run there. Following from the previous release, we also use the slighty faster Lighter header in the examples. And once it got to CRAN I also updated the Debian package. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.1.0 (2024-03-02)
  • Upgraded to Armadillo release 12.8.1 (Cortisol Injector)
    • Workaround in norm() for yet another bug in macOS accelerate framework
  • Update README for RcppArmadillo usage counts
  • Update examples to use '#include <RcppArmadillo/Lighter>' for faster compilation excluding unused Rcpp features

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

28 February 2024

Dirk Eddelbuettel: RcppEigen 0.3.4.0.0 on CRAN: New Upstream, At Last

We are thrilled to share that RcppEigen has now upgraded to Eigen release 3.4.0! The new release 0.3.4.0.0 arrived on CRAN earlier today, and has been shipped to Debian as well. Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. This update has been in the works for a full two and a half years! It all started with a PR #102 by Yixuan bringing the package-local changes for R integration forward to usptream release 3.4.0. We opened issue #103 to steer possible changes from reverse-dependency checking through. Lo and behold, this just stalled because a few substantial changes were needed and not coming. But after a long wait, and like a bolt out of a perfectly blue sky, Andrew revived it in January with a reverse depends run of his own along with a set of PRs. That was the push that was needed, and I steered it along with a number of reverse dependency checks, and occassional emails to maintainers. We managed to bring it down to only three packages having a hickup, and all three had received PRs thanks to Andrew and even merged them. So the plan became to release today following a final fourteen day window. And CRAN was convinced by our arguments that we followed due process. So there it is! Big big thanks to all who helped it along, especially Yixuan and Andrew but also Mikael who updated another patch set he had prepared for the previous release series. The complete NEWS file entry follows.

Changes in RcppEigen version 0.3.4.0.0 (2024-02-28)
  • The Eigen version has been upgrade to release 3.4.0 (Yixuan)
  • Extensive reverse-dependency checks ensure only three out of over 400 packages at CRAN are affected; PRs and patches helped other packages
  • The long-running branch also contains substantial contributions from Mikael Jagan (for the lme4 interface) and Andrew Johnson (revdep PRs)

Courtesy of CRANberries, there is also a diffstat report for the most recent release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

9 February 2024

Abhijith PA: A new(kind of) phone

I was using a refurbished Xiaomi Redmi 6 Pro (codename: sakura . I do remember buying this phone to run LineageOS as I found this model had support. But by the time I bought it (there was a quite a gap between my research and the actual purchase) lineageOS ended the suport for this device. The maintainer might ve ended this port, I don t blame them. Its a non rewarding work. Later I found there is a DotOS custom rom available. I did a week of test run. There seems lot of bugs and it all was minor and I can live with that. Since there is no other official port for redmi 6 pro at the time, I settled on this buggy DotOS nightly build. The phone was doing fine all these(3+) years, but recently the camera started showing greyish dots on some areas to a point that I can t even properly take a photo. The outer body of the phone wasn t original, as it was a refurbished, thus the colors started to peel off and wasn t aesthetically pleasing. Then there is the Location detection issue which took a while to figure out location and battery drains fast. I recently started mapping public transportation routes, and the GPS issue was a problem for me. With all these problems combined, I decided to buy a new phone. But choosing a phone wasn t easy. There are far too many options in the market. However, one thing I was quite sure of was that I wouldn t be buying a brand new phone, but a refurbished one. It is my protest against all these phone manufacturers who have convinced the general public that a mobile phone s lifespan is only 2 years. Of course, there are a few exceptions, but the majority of players in the Indian market are not. Now I think about it, I haven t bought any brand new computing device, be it phones or laptops since 2015. All are refurbished devices except for an Orange pi. My Samsung R439 laptop which I already blogged about is still going strong. Back to picking phone. So I began by comparing LineageOS website to an online refurbished store to pick a phone that has an official lineage support then to reddit search for any reported hardware failures Google Pixel phones are well-known in the custom ROM community for ease of installation, good hardware and Android security releases. My above claims are from privacyguides.org and XDA-developers. I was convinced to go with this choice. But I have some one at home who is at the age of throwing things. So I didn t want to risk buying an expensive Pixel phone only to have it end up with a broken screen and edges. Perhaps I will consider buying a Pixel next time. After doing some more research and cross comparison I landed up on Redmi note 9. Codename: merlinx. Based on my previous experience I knew it is going to be a pain to unlock the bootloader of Xiaomi phones and I was prepared for that but this time there was an extra hoop. The phone came with ROM MIUI version 13. A small search on XDA forums and reddit told unless we downgrade to 12 or so it is difficult to unlock bootloader for this device. And performing this wasn t exactly easy as we need tools like SP Flash etc. It took some time, but I ve completed the downgrade process. From there, the rest was cakewalk as everything was perfectly documented in LineageOS wiki. And ta-da, I have a phone running lineageOS. I been using it for some time and honestly I haven t come across any bug in my way of usage. One advice I will give to you if you are going to flash a custom ROM. There are plenty of videos about unlocking bootloader, installing ROMs in Youtube. Please REFRAIN from watching and experimenting it on your phone unless you figured a way to un brick your device first.

Reproducible Builds (diffoscope): diffoscope 256 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 256. This version includes the following changes:
* CVE-2024-25711: Use a determistic name when extracting content from GPG
  artifacts instead of trusting the value of gpg's --use-embedded-filenames.
  This prevents a potential information disclosure vulnerability that could
  have been exploited by providing a specially-crafted GPG file with an
  embedded filename of, say, "../../.ssh/id_rsa".
  Many thanks to Daniel Kahn Gillmor <dkg@debian.org> for reporting this
  issue and providing feedback.
  (Closes: reproducible-builds/diffoscope#361)
* Temporarily fix support for Python 3.11.8 re. a potential regression
  with the handling of ZIP files. (See reproducible-builds/diffoscope#362)
You find out more by visiting the project homepage.

8 February 2024

Dirk Eddelbuettel: RcppArmadillo 0.12.8.0.0 on CRAN: New Upstream, Interface Polish

armadillo image Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language and is widely used by (currently) 1119 other packages on CRAN, downloaded 32.5 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 575 times according to Google Scholar. This release brings a new (stable) upstream (minor) release Armadillo 12.8.0 prepared by Conrad two days ago. We, as usual, prepared a release candidate which we tested against the over 1100 CRAN packages using RcppArmadillo. This found no issues, which was confirmed by CRAN once we uploaded and so it arrived as a new release today in a fully automated fashion. We also made a small change that had been prepared by GitHub issue #400: a few internal header files that were cluttering the top-level of the include directory have been moved to internal directories. The standard header is of course unaffected, and the set of full / light / lighter / lightest headers (matching we did a while back in Rcpp) also continue to work as one expects. This change was also tested in a full reverse-dependency check in January but had not been released to CRAN yet. The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.0.0 (2024-02-06)
  • Upgraded to Armadillo release 12.8.0 (Cortisol Injector)
    • Faster detection of symmetric expressions by pinv() and rank()
    • Expanded shift() to handle sparse matrices
    • Expanded conv_to for more flexible conversions between sparse and dense matrices
    • Added cbrt()
    • More compact representation of integers when saving matrices in CSV format
  • Five non-user facing top-level include files have been removed (#432 closing #400 and building on #395 and #396)

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform used by titans like Google, Meta, Boeing, and Lockheed Martin :
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch s use of self-hosted runners as well as submitting a pull request to address a trivial typo in the project s README file to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval d or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found on their announcement message, as well as on the tool s homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code? Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I m not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.

Distribution updates In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) . Additionally 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Community updates There were made a number of improvements to our website, including Bernhard M. Wiedemann fixing a number of typos of the term nondeterministic . [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds. [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 254 and 255 to Debian but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well considerable work from Vekhir in order to fix compatibility between various and subtle incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don t mark a job as failed if process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install the libpam0g-dev which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it s January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel s OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance was performed by Holger Levsen (eg. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (eg. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches The Reproducible Builds project tries to detects, dissects and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Next.