Search Results: "Stefano Zacchiroli"

9 March 2024

Reproducible Builds: Reproducible Builds in February 2024

Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.

Reproducible Builds at FOSDEM 2024 Core Reproducible Builds developer Holger Levsen presented in the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. That wasn't the only talk related to Reproducible Builds, however: please see our comprehensive FOSDEM 2024 news post for the full details and links.

Maintainer Perspectives on Open Source Software Security Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation sports an infographic which mentions that "56% of [polled] projects support reproducible builds".

Three new reproducibility-related academic papers A total of three separate scholarly papers related to Reproducible Builds have appeared this month: Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors by Taylor R. Schorlemmer, Kelechi G. Kalu, Luke Chigges, Kyung Myung Ko, Eman Abdul-Muhd, Abu Ishgair, Saurabh Bagchi, Santiago Torres-Arias and James C. Davis (Purdue University, Indiana, USA) is concerned with the problem that:
Package maintainers can guarantee package authorship through software signing [but] it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data on signing practices, but measured single platforms, did not consider time, and did not provide insight on factors that may influence signing. We lack a comprehensive, multi-platform understanding of signing adoption and relevant factors. This study addresses this gap. (arXiv, full PDF)

Reproducibility of Build Environments through Space and Time by Julien Malka, Stefano Zacchiroli and Théo Zimmermann (Institut Polytechnique de Paris, France) addresses:
[The] principle of reusability […] makes it harder to reproduce projects' build environments, even though reproducibility of build environments is essential for collaboration, maintenance and component lifetime. In this work, we argue that functional package managers provide the tooling to make build environments reproducible in space and time, and we produce a preliminary evaluation to justify this claim.
The abstract continues with the claim that "Using historical data, we show that we are able to reproduce build environments of about 7 million Nix packages, and to rebuild 99.94% of the 14 thousand packages from a 6-year-old Nixpkgs revision." (arXiv, full PDF)
Options Matter: Documenting and Fixing Non-Reproducible Builds in Highly-Configurable Systems by Georges Aaron Randrianaina, Djamel Eddine Khelladi, Olivier Zendra and Mathieu Acher (Inria centre at Rennes University, France):
This paper thus proposes an approach to automatically identify configuration options causing non-reproducibility of builds. It begins by building a set of builds in order to detect non-reproducible ones through binary comparison. We then develop automated techniques that combine statistical learning with symbolic reasoning to analyze over 20,000 configuration options. Our methods are designed to both detect options causing non-reproducibility, and remedy non-reproducible configurations, two tasks that are challenging and costly to perform manually. (HAL Portal, full PDF)

Mailing list highlights From our mailing list this month:

User cen posted a query asking "How to verify a package by rebuilding it locally on Debian", which received a followup from Vagrant Cascadian.

James Addison asked "Two questions about build-path reproducibility in Debian" regarding the differences in the testing performed by Debian's GitLab continuous integration (CI) pipeline and the Debian-specific testing performed by the Reproducible Builds project itself, and followed this with a separate but related question regarding misconfigured reprotest configurations.

Distribution work In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed, adding to Debian's knowledge about identified issues. A number of issue types were updated as well. In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours)" and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in-depth to a query about a previous report.
Fedora developer Zbigniew Jędrzejewski-Szmek announced a work-in-progress script called `fedora-repro-build` that attempts to reproduce an existing package within a koji build environment. Although the project's `README` file lists a number of "fields [that] will always or almost always vary" and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files left over by the Sphinx documentation generator, which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.
Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work in openSUSE.

diffoscope diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded versions `256`, `257` and `258` to Debian and made the following additional changes:

Use a deterministic name instead of trusting `gpg`'s use-embedded-filenames. Many thanks to Daniel Kahn Gillmor <dkg@debian.org> for reporting this issue and providing feedback.

Don't error-out with a traceback if we encounter `struct.unpack`-related errors when parsing Python `.pyc` files. (#1064973)

Don't try and compare `rdb_expected_diff` on non-GNU systems as `%p` formatting can vary, especially with respect to macOS.

Fix compatibility with `pytest` 8.0.

Temporarily fix support for Python 3.11.8.

Use the `7zip` package (over `p7zip-full`) after a Debian package transition. (#1063559)

Bump the minimum Black source code reformatter requirement to 24.1.1+.

Expand an older changelog entry with a CVE reference.

Make `test_zip` black clean.

In addition, James Addison contributed a patch to parse the headers from `diff(1)` correctly. Thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to versions 255, 256 and 258, and updated trydiffoscope to 67.0.6.

reprotest reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:

Create a (working) proof of concept for enabling a specific number of CPUs.

Consistently use 398 days for time variation rather than choosing randomly, and update `README.rst` to match.

Support a new `--vary=build_path.path` option.

Website updates A number of improvements were made to our website this month, including:

Chris Lamb:

Improve the relative sizing of headers.

Re-order and punch up the introduction and documentation on the `SOURCE_DATE_EPOCH` page.

Update `SOURCE_DATE_EPOCH` documentation re. `datetime.datetime.fromtimestamp`. Thanks, James Addison.

Add a post about Reproducible Builds at FOSDEM 2024.

Holger Levsen:

Update the GNU Guix page to include their reproducibility QA page.

Add Sune Vuorela and Jan-Benedict Glaw to our contributors list.

Mattia Rizzolo:

Add Sovereign Tech Fund's logo to our sponsors.

Update our sponsors list.
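As a rough illustration of the `SOURCE_DATE_EPOCH` documentation item above, the sketch below shows one way a Python build script might derive its build date. The helper name `build_timestamp` is hypothetical, and it is an assumption here that the documentation change concerns producing a timezone-aware UTC value from `datetime.datetime.fromtimestamp`.

```python
import datetime
import os
import time


def build_timestamp(environ=os.environ):
    """Return the build date as a timezone-aware UTC datetime.

    Hypothetical helper: honours SOURCE_DATE_EPOCH when it is set and only
    falls back to the current time when it is not.
    """
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    # Passing tz= keeps the result independent of the build machine's
    # local timezone, which would otherwise leak into the output.
    return datetime.datetime.fromtimestamp(epoch, tz=datetime.timezone.utc)
```

Calling `build_timestamp()` twice with the same `SOURCE_DATE_EPOCH` value yields the same date regardless of when or where the build runs.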

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:

Debian-related changes:

Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken.

Use the 64-bit `amd64` kernel on all `i386` nodes; no more 686 PAE kernels.

Add an Erlang package set.

Other changes:

Grant Jan-Benedict Glaw shell access to the Jenkins node.

Enable debugging for NetBSD reproducibility testing.

Use `/usr/bin/du --apparent-size` in the Jenkins shell monitor.

Revert "reproducible nodes: mark osuosl2 as down".

Thanks again to Codethink, for they have doubled the RAM on our `arm64` nodes.

Only set `/proc/$pid/oom_score_adj` to -1000 if it has not already been done.

Add the `opemwrt-target-tegra` and `jtx` task to the list of zombie jobs.

Vagrant Cascadian also made the following changes:

Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm.

Add two new `armhf` architecture build nodes, `virt32z` and `virt64z`, and insert them into the Munin monitoring.

In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the `tegra` target with `mpc85xx`, Jan-Benedict Glaw updated the NetBSD build script to use a separate `$TMPDIR` to mitigate out-of-space issues on a tmpfs-backed `/tmp`, and Zheng Junjie added a link to the GNU Guix tests. Lastly, node maintenance was performed by Holger Levsen and Vagrant Cascadian.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Philip Rinn:

`gimagereader` (date)

Bernhard M. Wiedemann:

`grass` (date-related issue)

`grub2` (filesystem ordering issue)

`latex2html` (drop a non-deterministic log)

`mhvtl` (tar)

`obs` (build-tool issue)

`ollama` (GZip embedding the modification time)

`presenterm` (filesystem-ordering issue)

`qt6-quick3d` (parallelism)

Chris Lamb:

#1064506 filed against `geophar`.

#1064891 filed against `pytest-repeat`.

#1064892 filed against `klepto`.

James Addison:

#1064519 filed against `flask-limiter`.

`python-parsl-doc` (disable dynamic argument evaluation by Sphinx `autodoc` extension)

`python3-pytest-repeat` (remove `entry_points.txt` creation that varied by shell)

`python3-selinux` (remove packaged `direct_url.json` file that embeds build path)

`python3-sepolicy` (remove packaged `direct_url.json` file that embeds build path)

#1064575 filed against `pyswarms`.

#1064638 filed against `python-x2go`.

`snapd` (fix timestamp header in packaged manual-page)

`zzzeeksphinx` (existing RB patch forwarded and merged (with modifications))

Johannes Schauer Marin Rodrigues:

#1063939 filed against `fop`.
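Several of the patches above deal with timestamps leaking into archives, such as the Gzip modification time noted for `ollama`. As a minimal sketch of that class of fix (not the actual patch, which is not reproduced here), the hypothetical helper below pins the timestamp field of the gzip header using Python's standard `gzip` module.

```python
import gzip


def gzip_deterministic(data: bytes, mtime: int = 0) -> bytes:
    """Compress data with a fixed timestamp in the gzip header.

    Hypothetical helper: by default the gzip module stores the current
    time in the header, so two otherwise identical builds differ.
    """
    # mtime is an argument of gzip.compress since Python 3.8; a build
    # could pass its SOURCE_DATE_EPOCH value here instead of 0.
    return gzip.compress(data, mtime=mtime)
```

With the timestamp pinned, compressing the same input twice produces byte-identical output, whenever the build happens to run.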

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Twitter: @ReproBuilds

Mastodon: @reproducible_builds@fosstodon.org

Mailing list: `rb-general@lists.reproducible-builds.org`

22 January 2024

Chris Lamb: Increasing the Integrity of Software Supply Chains awarded IEEE Best Paper award

IEEE Software recently announced that a paper that I co-authored with Dr. Stefano Zacchiroli has been awarded their Best Paper award:

Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains, the abstract reads as follows:

Although it is possible to increase confidence in Free and Open Source Software (FOSS) by reviewing its source code, trusting code is not the same as trusting its executable counterparts. These are typically built and distributed by third-party vendors with severe security consequences if their supply chains are compromised. In this paper, we present reproducible builds, an approach that can determine whether generated binaries correspond with their original source code. We first define the problem and then provide insight into the challenges of making real-world software build in a "reproducible" manner, that is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).

According to Google Scholar, the paper has accumulated almost 40 citations since publication. The full text of the paper can be found in PDF format.

11 January 2024

Reproducible Builds: Reproducible Builds in December 2023

Welcome to the December 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.

Reproducible Builds: Increasing the Integrity of Software Supply Chains awarded IEEE Software Best Paper award In February 2022, we announced in these reports that a paper written by Chris Lamb and Stefano Zacchiroli, titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), was now available in the March/April 2022 issue of IEEE Software. This month, however, IEEE Software announced that this paper has won their Best Paper award for 2022.

Reproducibility to affect package migration policy in Debian In a post summarising the activities of the Debian Release Team at a recent in-person Debian event in Cambridge, UK, Paul Gevers announced a change to the way packages are migrated into the staging area for the next stable Debian release based on its reproducibility status:
The folks from the Reproducibility Project have come a long way since they started working on it 10 years ago, and we believe it's time for the next step in Debian. Several weeks ago, we enabled a migration policy in our migration software that checks for regression in reproducibility. At this moment, that is presented as just for info, but we intend to change that to delays in the not so distant future. We eventually want all packages to be reproducible. To stimulate maintainers to make their packages reproducible now, we'll soon start to apply a bounty [speedup] for reproducible builds, like we've done with passing autopkgtests for years. We'll reduce the bounty for successful autopkgtests at that moment in time.

Speranza: Usable, privacy-friendly software signing Kelsey Merrill, Karen Sollins, Santiago Torres-Arias and Zachary Newman have developed a new system called Speranza, which is aimed at reassuring software consumers that the product they are getting has not been tampered with and is coming directly from a source they trust. A write-up on TechXplore.com goes into some more details:
"What we have done," explains Sollins, "is to develop, prove correct, and demonstrate the viability of an approach that allows the [software] maintainers to remain anonymous." Preserving anonymity is obviously important, given that almost everyone, software developers included, value their confidentiality. This new approach, Sollins adds, "simultaneously allows [software] users to have confidence that the maintainers are, in fact, legitimate maintainers and, furthermore, that the code being downloaded is, in fact, the correct code of that maintainer."
The corresponding paper is published on the arXiv preprint server in various formats, and the announcement has also been covered in MIT News.

Nondeterministic Git bundles Paul Baecher published an interesting blog post on Reproducible git bundles. For those who are not familiar with them, Git bundles are used for the offline transfer of Git objects without an active server sitting on the other side of a network connection. Anyway, Paul wrote about writing a backup system for his entire system, but:
I noticed that a small but fixed subset of [Git] repositories are getting backed up despite having no changes made. That is odd because I would think that repeated bundling of the same repository state should create the exact same bundle. However [it] turns out that for some, repositories bundling is nondeterministic.
Paul goes on to describe his solution, noting that "forcing git to be single threaded makes the output deterministic". The article was also discussed on Hacker News.

Output from libxslt now deterministic
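For readers who want to try the single-threaded approach, here is a minimal sketch in Python. It assumes that Git's standard `pack.threads` configuration variable is the relevant knob, which the summary above does not state explicitly, and the function names are hypothetical.

```python
import subprocess


def bundle_command(bundle_path, threads=1):
    """Build a `git bundle create` command line with pinned pack threading.

    Assumption: limiting pack.threads to 1 is what "single threaded"
    refers to; `--all` bundles every ref in the repository.
    """
    return [
        "git",
        "-c", f"pack.threads={threads}",
        "bundle", "create", str(bundle_path), "--all",
    ]


def create_bundle(repo_dir, bundle_path):
    """Run the command inside repo_dir, raising if Git reports an error."""
    subprocess.run(bundle_command(bundle_path), cwd=repo_dir, check=True)
```

A backup script could then compare the checksum of a freshly created bundle against the previous one and skip the upload when nothing has changed.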

libxslt is the XSLT C library developed for the GNOME project, where XSLT itself is an XML language to define transformations for XML files. This month, it was revealed that the result of the generate-id() XSLT function is now deterministic across multiple transformations, fixing many issues with reproducible builds. As the Git commit by Nick Wellnhofer describes:

Rework the generate-id() function to return deterministic values. We use
a simple incrementing counter and store ids in the 'psvi' member of
nodes which was freed up by previous commits. The presence of an id is
indicated by a new "source node" flag.
This fixes long-standing problems with reproducible builds, see
https://bugzilla.gnome.org/show_bug.cgi?id=751621
This also hardens security, as the old implementation leaked the
difference between a heap and a global pointer, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1356211
The old implementation could also generate the same id for dynamically
created nodes which happened to reuse the same memory. Ids for namespace
nodes were completely broken. They now use the id of the parent element
together with the hex-encoded namespace prefix.
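The idea in the commit message can be illustrated with a small Python model. This is only a sketch of the counter-based scheme and not libxslt's C implementation: the `Node` class, the `id` prefix and the attribute used to store the identifier are assumptions made for illustration.

```python
import itertools


class Node:
    """Stand-in for an XML node; `psvi` mimics the reused storage slot."""

    def __init__(self, name):
        self.name = name
        self.psvi = None


class IdGenerator:
    """Hand out ids from an incrementing counter instead of memory addresses."""

    def __init__(self):
        self._counter = itertools.count(1)

    def generate_id(self, node):
        # The first request stores the next counter value on the node;
        # later requests for the same node return the stored value.
        if node.psvi is None:
            node.psvi = next(self._counter)
        return f"id{node.psvi}"
```

Because the identifier depends only on the order in which nodes are first visited, two runs over the same document give the same ids, whereas an address-based id changes with the memory layout of each run.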

Community updates A number of improvements were made to our website, including Chris Lamb fixing the `generate-draft` script to not blow up if the input files have been corrupted today or even in the past, Holger Levsen updating the Hamburg 2023 summit page to add a link to the farewell post and a picture of a Post-It note, and Pol Dellaiera updating the paragraph about `tar` and the `--clamp-mtime` flag. On our mailing list this month, Bernhard M. Wiedemann posted an interesting summary on some of the reasons why packages are still not reproducible in 2023.

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including processing `objdump` symbol comment filter inputs as Python `bytes` (and not `str`) instances, and Vagrant Cascadian extended diffoscope support for GNU Guix and updated the version in that distribution to version 253.

Challenges of Producing Software Bill Of Materials for Java Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, César Soto-Valero and Martin Wittlinger (!) of the KTH Royal Institute of Technology in Sweden, have published an article in which they:
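To show what clamping modification times means in practice, the sketch below mimics the effect of GNU tar's `--clamp-mtime` using Python's `tarfile` module. It is an illustration under stated assumptions rather than the website's recommended recipe: the function name and the in-memory input format are hypothetical.

```python
import io
import tarfile


def deterministic_tar(files, source_date_epoch):
    """Build a tar archive whose bytes do not depend on build time or order.

    `files` maps a member name to a (content bytes, mtime) pair; this
    input format is an assumption made to keep the example self-contained.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            data, mtime = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            # Clamp: timestamps newer than the reference date are replaced
            # by it, while older timestamps are kept as they are.
            info.mtime = min(mtime, source_date_epoch)
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Only files touched during the build (and therefore newer than the reference date) have their timestamps rewritten, so genuine historical modification times in the source tree survive.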
deep-dive into 6 tools and the accuracy of the SBOMs they produce for complex open-source Java projects. Our novel insights reveal some hard challenges regarding the accurate production and usage of software bills of materials.
The paper is available on arXiv.

Debian Non-Maintainer campaign As mentioned in previous reports, the Reproducible Builds team within Debian has been organising a series of online and offline sprints in order to clear the huge backlog of reproducible builds patches submitted by performing so-called NMUs (Non-Maintainer Uploads). During December, Vagrant Cascadian performed a number of such uploads, including:

`crack` (#1021521 & #1021522)

`dustmite` (#1020878 & #1020879)

`edid-decode` (#1020877)

`gentoo` (#1024284)

`haskell98-report` (#1024007)

`infinipath-psm` (#990862)

`lcm` (#1024286)

`libapache-mod-evasive` (#1020800)

`libccrtp` (#860470)

`libinput` (#995809)

`lirc` (#979019, #979023 & #979024)

`mm-common` (#977177)

`mpl-sphinx-theme` (#1005826)

`psi` (#1017473)

`python-parse-type` (#1002671)

`ruby-tioga` (#1005727)

`ucspi-proxy` (#1024125)

`ypserv` (#983138)

In addition, Holger Levsen performed three no-source-change NMUs in order to address the last packages without `.buildinfo` files in Debian trixie, specifically `lorene` (0.0.0~cvs20161116+dfsg-1.1), `maria` (1.3.5-4.2) and `ruby-rinku` (1.7.3-2.1).

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In December, a number of changes were made by Holger Levsen:

Debian-related changes:

Fix matching packages for the R programming language.

Add a Certbot configuration for the Nginx web server.

Enable debugging for the `create-meta-pkgs` tool.

Arch Linux-related changes:

The `asp` tool has been deprecated in favour of `pkgctl`; thanks to dvzrv for the pointer.

Disable the Arch Linux builders for now.

Stop referring to the `/trunk` branch / subdirectory.

Use `--protocol https` when cloning repositories using the `pkgctl` tool.

Misc changes:

Install the `python3-setuptools` and `swig` packages, which are now needed to build OpenWrt.

Install `pkg-config`, which is needed to build Coreboot artifacts.

Detect failures due to an issue where the `fakeroot` tool is implicitly required but not automatically installed.

Detect failures due to the rename of the `vmlinuz` file.

Improve the grammar of an error message.

Document that `freebsd-jenkins.debian.net` has been updated to FreeBSD 14.0.

In addition, node maintenance was performed by Holger Levsen and Vagrant Cascadian.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Bernhard M. Wiedemann:

`apr` (hostname issue)

`dune` (parallelism)

`epy` (time-based `.pyc` issue)

`fpc` (Year 2038)

`gap` (date)

`gh` (FTBFS in 2024)

`kubernetes` (fixed random build path)

`libgda` (date)

`libguestfs` (tar)

`metamail` (date)

`mpi-selector` (date)

`neovim` (randomness in Lua)

`nml` (time-based `.pyc`)

`pommed` (parallelism)

`procmail` (benchmarking)

`pysnmp` (FTBFS in 2038)

`python-efl` (drop Sphinx doctrees)

`python-pyface` (time)

`python-pytest-salt-factories` (time-based `.pyc` issue)

`python-quimb` (fails to build on single-CPU systems)

`python-rdflib` (random)

`python-yarl` (random path)

`qt6-webengine` (parallelism issue in documentation)

`texlive` (Gzip modification time issue)

`waf` (time-based `.pyc`)

`warewulf` (CPIO modification time and inode issue)

`xemacs` (toolchain hostname)

Chris Lamb:

#1057710 filed against `python-aiostream`.

#1057721 filed against `openpyxl`.

#1058681 filed against `python-multipletau`.

#1059013 filed against `wxmplot`.

#1059014 filed against `stunnel4`.

James Addison:

#1059592 & #1059631 filed against `qttools-opensource-src`.
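A recurring entry in the list above is the "time-based `.pyc`" issue: by default a compiled Python file records the source file's modification time in its header. One general remedy, sketched below with a hypothetical helper name, is to request hash-based invalidation through the standard `py_compile` module. Whether the individual patches above take this route is not stated here.

```python
import py_compile


def compile_reproducibly(source_path, pyc_path):
    """Compile a module so its .pyc header holds a source hash, not an mtime.

    Hypothetical helper. Note that the compiled code still embeds the
    source file name, so the build path must also be stable (or be
    overridden with py_compile's dfile argument).
    """
    return py_compile.compile(
        source_path,
        cfile=pyc_path,
        doraise=True,  # raise PyCompileError instead of printing it
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )
```

Python also selects this hash-based mode by default when the `SOURCE_DATE_EPOCH` environment variable is set, which is often the simpler fix inside a packaging build.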

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Mailing list: `rb-general@lists.reproducible-builds.org`

Mastodon: @reproducible_builds

Twitter: @ReproBuilds

13 July 2022

Reproducible Builds: Reproducible Builds in June 2022

Welcome to the June 2022 report from the Reproducible Builds project. In these reports, we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.

Save the date! Despite several delays, we are pleased to announce dates for our in-person summit this year: November 1st 2022 to November 3rd 2022.
The event will happen in/around Venice (Italy), and we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement mail from Mattia Rizzolo, and do keep an eye on the mailing list for further announcements as it will hopefully include registration instructions.

News David Wheeler filed an issue against the Rust programming language to report that builds are "not reproducible because full path to the source code is in the panic and debug strings". Luckily, as one of the responses mentions: "the `--remap-path-prefix` solves this problem and has been used to great effect in build systems that rely on reproducibility (Bazel, Nix) to work at all" and that "there are efforts to teach cargo about it here".
The Python Security team announced that:
The `ctx` hosted project on PyPI was taken over via user account compromise and replaced with a malicious project which contained runtime code which collected the content of `os.environ.items()` when instantiating Ctx objects. The captured environment variables were sent as a base64 encoded query parameter to a Heroku application […]
As their announcement later goes on to state, version-pinning using hash-checking mode can prevent this attack, although this does depend on specific installations using this mode, rather than a prevention that can be applied systematically.
Developer vanitasvitae published an interesting and entertaining blog post detailing the blow-by-blow steps of debugging a reproducibility issue in PGPainless, a library which "aims to make using OpenPGP in Java projects as simple as possible". Whilst their in-depth research into the internals of the `.jar` may have been unnecessary given that diffoscope would have identified the issue, there is something to be said for occasionally delving into seemingly low-level details, as well as for describing any debugging process. Indeed, as vanitasvitae writes:
Yes, this would have spared me from 3h of debugging. But I probably would also not have gone onto this little dive into the JAR/ZIP format, so in the end I'm not mad.
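The principle behind hash-checking mode is simple enough to sketch: a digest is recorded when a dependency is pinned, and any later download is refused unless it matches. The function below is a hypothetical illustration of that check and not pip's implementation. In pip itself the mode is enabled with `--require-hashes`, with `--hash=sha256:...` entries in the requirements file.

```python
import hashlib


def verify_artifact(data: bytes, pinned_sha256: str) -> None:
    """Refuse an artifact whose SHA-256 digest differs from the pinned one.

    Hypothetical helper: `pinned_sha256` is the hex digest recorded when
    the dependency was originally reviewed and pinned.
    """
    actual = hashlib.sha256(data).hexdigest()
    if actual != pinned_sha256.lower():
        raise ValueError(
            f"hash mismatch: expected {pinned_sha256}, got {actual}"
        )
```

A release swapped in by a compromised account, as in the `ctx` case, would carry different bytes and hence a different digest, so installation would stop before any of its code runs.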

Kees Cook published a short and practical blog post detailing how he uses reproducibility properties to aid work to replace one-element arrays in the Linux kernel. Kees' approach is based on the principle that if a (small) proposed change is considered equivalent by the compiler, then the generated output will be identical, but only if no other arbitrary or unrelated changes are introduced. Kees mentions the "fantastic" diffoscope tool, as well as various kernel-specific build options (e.g. `KBUILD_BUILD_TIMESTAMP`) in order to "prepare my build with the 'known to disrupt code layout' options disabled".
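The comparison step of such a workflow can be sketched in a few lines of Python: build once before the change and once after, then list every file whose contents differ. This is a simplified stand-in for what diffoscope does in far more depth, and the function names and directory layout are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large build outputs do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(build_a, build_b):
    """Return relative paths that are missing on one side or differ in content."""
    root_a, root_b = Path(build_a), Path(build_b)
    files_a = {p.relative_to(root_a).as_posix() for p in root_a.rglob("*") if p.is_file()}
    files_b = {p.relative_to(root_b).as_posix() for p in root_b.rglob("*") if p.is_file()}
    changed = {
        name for name in files_a & files_b
        if sha256_of(root_a / name) != sha256_of(root_b / name)
    }
    return sorted((files_a ^ files_b) | changed)
```

An empty result means the change was a no-op as far as the build output is concerned. Any entry in the list is a file worth handing to diffoscope for a closer look.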
Stefano Zacchiroli gave a presentation at GDR Sécurité Informatique based in part on a paper co-written with Chris Lamb titled Increasing the Integrity of Software Supply Chains. (Tweet)

Debian In Debian this month, 28 reviews of Debian packages were added, 35 were updated and 27 were removed, adding to our knowledge about identified issues. Two issue types were added: `nondeterministic_checksum_generated_by_coq` and `nondetermistic_js_output_from_webpack`. After Holger Levsen found hundreds of packages in the bookworm distribution that lack `.buildinfo` files, he uploaded 404 source packages to the archive (with no meaningful source changes). bookworm now shows only 8 packages without `.buildinfo` files, and those 8 are fixed in unstable and should migrate shortly. By contrast, Debian unstable will always have packages without `.buildinfo` files, as this is how they come through the NEW queue. However, as these packages were not built on the official build servers (i.e. they were uploaded by the maintainer) they will never migrate to Debian testing. In the future, therefore, testing should never have packages without `.buildinfo` files again. Roland Clobus posted yet another in-depth status report about his progress making the Debian Live images build reproducibly to our mailing list. In this update, Roland mentions that "all major desktops build reproducibly with bullseye, bookworm and sid" but also goes on to outline the progress made with automated testing of the generated images using openQA.

GNU Guix Vagrant Cascadian made a significant number of contributions to GNU Guix:

Submitted patches to fix reproducibility issues in keyutils and isl, and reported two bugs affecting reproducibility testing.

23 specific fixes related to reproducibility.

Proposed setting `FORCE_SOURCE_DATE=1` in the environment of all builds in order to fix numerous timestamp issues in documentation generation tools.

Identified reproducibility issues in the `maradns` package as it appears to embed a random prime number. (Patch)

Responded in a thread to point out that GNU Guix already has the infrastructure in place to verify the reproducibility of downloaded substitutes for the vast majority of packages.

Lastly, Vagrant performed an evaluation of the unreproducible packages that remain in the distribution.

Elsewhere in GNU Guix, Ludovic Courtès published a paper in the journal The Art, Science, and Engineering of Programming called Building a Secure Software Supply Chain with GNU Guix:
This paper focuses on one research question: how can Guix and similar systems allow users to securely update their software? […] Our main contribution is a model and tool to authenticate new Git revisions. We further show how, building on Git semantics, we build protections against downgrade attacks and related threats. We explain implementation choices. This work has been deployed in production two years ago, giving us insight on its actual use at scale every day. The Git checkout authentication at its core is applicable beyond the specific use case of Guix, and we think it could benefit to developer teams that use Git.
A full PDF of the text is available.

openSUSE In the world of openSUSE, SUSE announced at SUSECon that they are preparing to meet SLSA level 4. (SLSA, or Supply chain Levels for Software Artifacts, is a new industry-led standardisation effort that aims to protect the integrity of the software supply chain.) However, at the time of writing, timestamps within RPM archives are not normalised, so bit-for-bit identical reproducible builds are not possible. Some in-toto provenance files were published for SUSE's SLE-15-SP4 as one result of the SLSA level 4 effort. Old binaries are not rebuilt, so only new builds (e.g. maintenance updates) have this metadata added. Lastly, Bernhard M. Wiedemann posted his usual monthly openSUSE reproducible builds status report.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions `215`, `216` and `217` to Debian unstable. Chris Lamb also made the following changes:

New features:

Print profile output if we were called with `--profile` and we were killed via a `TERM` signal. This should help in situations where diffoscope is terminated due to some sort of timeout. [ ]

Support both PyPDF 1.x and 2.x. [ ]

Bug fixes:

Also catch `IndexError` exceptions (in addition to `ValueError`) when parsing `.pyc` files. (#1012258)

Correct the logic for supporting different versions of the `argcomplete` module. [ ]

Output improvements:

Don't leak the (likely-temporary) pathname when comparing PDF documents. [ ]

Logging improvements:

Update test fixtures for GNU readelf 2.38 (now in Debian unstable). [ ][ ]

Be more specific about the minimum required version of `readelf` (ie. binutils), as it appears that this patch level version change resulted in a change of output, not the minor version. [ ]

Use our `@skip_unless_tool_is_at_least` decorator (NB. `at_least`) over `@skip_if_tool_version_is` (NB. `is`) to fix tests under Debian stable. [ ]

Emit a warning if/when we are handling a UNIX `TERM` signal. [ ]

Codebase improvements:

Clarify in what situations the main `finally` block gets called with respect to `TERM` signal handling. [ ]

Clarify control flow in the `diffoscope.profiling` module. [ ]

Correctly package the `scripts/` directory. [ ]

In addition, Edward Betts updated a broken link to the RSS on the diffoscope homepage and Vagrant Cascadian updated the diffoscope package in GNU Guix [ ][ ][ ].
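
Several of the changes above concern what happens when a long-running tool receives a `TERM` signal. The following Python sketch shows one common way to arrange this. It is not diffoscope's actual code: the names `Terminated`, `handle_term` and `run`, as well as the exit status `2`, are assumptions made for illustration. The handler raises an exception so that the stack unwinds normally and the `finally` block (where profiling output could be printed) still runs.

```python
import signal
import sys


class Terminated(Exception):
    """Raised from the signal handler so that normal cleanup code still runs."""


def handle_term(signum, frame):
    # Emit a warning, then unwind the stack via an exception.
    print("Received TERM signal; cleaning up", file=sys.stderr)
    raise Terminated()


def run(main, report):
    """Run `main`, calling `report` afterwards even if we were terminated."""
    previous = signal.signal(signal.SIGTERM, handle_term)
    try:
        main()
        return 0
    except Terminated:
        return 2  # arbitrary non-zero status chosen for this sketch
    finally:
        report()  # for example, print the collected profiling data
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)
```

Note that `signal.signal` may only be called from the main thread, so a sketch like this belongs in a program's entry point.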

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Bernhard M. Wiedemann:

`build-compare` caused a regression for a few days.

`python-fasttext` (CPU-related issue).

Chris Lamb:

#1012614 filed against `node-dommatrix`.

#1012766 filed against `rtpengine`.

#1012790 filed against `sphinxcontrib-mermaid`.

#1012792 filed against `yaru-theme`.

#1012836 filed against `mapproxy` (forwarded upstream).

#1013257 filed against `libxsmm`.

#1014041 filed against `yt-dlp` (forwarded upstream).

#891263 was filed against puppet in February 2018 and the patch was finally proposed for inclusion upstream.

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:

Holger Levsen:

Add a package set for packages that use the R programming language [ ] as well as one for Rust [ ].

Improve package set matching for Python [ ] and font-related [ ] packages.

Install the `lz4`, `lzop` and `xz-utils` packages on all nodes in order to detect running kernels. [ ]

Improve the cleanup mechanisms when testing the reproducibility of Debian Live images. [ ][ ]

In the automated node health checks, deprioritise the "generic kernel warning". [ ]

Roland Clobus (Debian Live image reproducibility):

Add various maintenance jobs to the Jenkins view. [ ]

Clean up old workspaces after 24 hours. [ ]

Clean up temporary workspace and resulting directories. [ ]

Implement a number of fixes and improvements around publishing files. [ ][ ][ ]

Don't attempt to preserve the file timestamps when copying artifacts. [ ]

And finally, node maintenance was also performed by Mattia Rizzolo [ ].

Mailing list and website On our mailing list this month:

David Wheeler started a thread stating his desire that reproducible builds and GitBOM are able to work together simultaneously. David first describes the goals of both GitBOM and reproducibility, outlines the potential problems and even outlines a number of prospective solutions.

In a similar vein, David Wheeler also posted about the problems with Profile-Guided Optimisation (PGO) in relation to reproducible builds.

Roland Clobus copied in our mailing list with a question about whether enabling link-time optimisations (LTO) in Debian as a whole might cause reproducibility problems.

Mattia Rizzolo posted a request for assistance regarding the translations of our website.

Lastly, Chris Lamb updated the main Reproducible Builds website and documentation in a number of small ways, but primarily published an interview with Hans-Christoph Steiner of the F-Droid project. Chris Lamb also added a Coffeescript example for parsing and using the `SOURCE_DATE_EPOCH` environment variable [ ]. In addition, Sebastian Crane very helpfully updated the screenshot of salsa.debian.org's "request access" button on the "How to join the Salsa group" page. [ ]
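
For readers unfamiliar with `SOURCE_DATE_EPOCH`, the following is a minimal Python sketch of how the variable is typically parsed and used (it is not the Coffeescript example mentioned above). The function name `build_timestamp` is an assumption made for illustration. When the variable is set, it is interpreted as a number of seconds since the Unix epoch; otherwise the current time is used.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the build time as a timezone-aware UTC datetime.

    Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set
    and falls back to the current time otherwise.
    """
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc)
```

Formatting the result in UTC, rather than in the builder's local timezone, avoids reintroducing variation between build machines.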

Contact If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Twitter: @ReproBuilds

Mailing list: `rb-general@lists.reproducible-builds.org`

5 March 2022

Reproducible Builds: Reproducible Builds in February 2022

Welcome to the February 2022 report from the Reproducible Builds project. In these reports, we try to round-up the important things we and others have been up to over the past month. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Jiawen Xiong, Yong Shi, Boyuan Chen, Filipe R. Cogo and Zhen Ming Jiang have published a new paper titled Towards Build Verifiability for Java-based Systems (PDF). The abstract of the paper contains the following:

Various efforts towards build verifiability have been made to C/C++-based systems, yet the techniques for Java-based systems are not systematic and are often specific to a particular build tool (eg. Maven). In this study, we present a systematic approach towards build verifiability on Java-based systems.

GitBOM is a flexible scheme to track the source code used to generate build artifacts via Git-like unique identifiers. Although the project has been active for a while, the community around GitBOM has now started running weekly community meetings.
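
As an illustration of what a "Git-like unique identifier" looks like, the sketch below computes the object ID that Git itself assigns to a file's contents (a "blob"). This is only Git's own scheme; GitBOM's exact identifier format may differ, and the function name `git_blob_id` is an assumption made for illustration.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID that Git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw content.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()
```

Because the identifier depends only on the content, two parties holding the same source file will always derive the same ID without having to coordinate.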

The paper by Chris Lamb and Stefano Zacchiroli is now available in the March/April 2022 issue of IEEE Software. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), the abstract of the paper contains the following:

We first define the problem, and then provide insight into the challenges of making real-world software build in a reproducible manner, that is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).
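
The phrase "bit-for-bit identical" can be checked mechanically. The following Python sketch, using the hypothetical helpers `sha256_hex` and `builds_agree`, compares the artifacts produced by several independent builders and reports whether they all match. It is an illustration of the principle rather than any tool used by the project.

```python
import hashlib


def sha256_hex(artifact: bytes) -> str:
    """Return the SHA-256 digest of a build artifact as a hex string."""
    return hashlib.sha256(artifact).hexdigest()


def builds_agree(artifacts):
    """Return True if every independently built artifact is bit-for-bit identical."""
    digests = {sha256_hex(artifact) for artifact in artifacts}
    return len(digests) == 1
```

In practice, each builder would publish only its digest, so that third parties can compare results without exchanging the binaries themselves.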

In openSUSE, Bernhard M. Wiedemann posted his monthly reproducible builds status report.

On our mailing list this month, Thomas Schmitt started a thread around the SOURCE_DATE_EPOCH specification related to formats that cannot help embedding potentially timezone-specific timestamps. (Full thread index.)

The Yocto Project is pleased to report that its core metadata (OpenEmbedded-Core) is now reproducible for all recipes (100% coverage) after issues with newer languages such as Golang were resolved. This was announced in their recent Year in Review publication. It is of particular interest for security updates so that systems can have specific components updated while reducing the risk of other unintended changes and making the sections of the system that change very clear for audit. The project is now also making heavy use of equivalence of build output to determine whether further items in builds need to be rebuilt or whether cached previously built items can be used. As mentioned in the article above, there are now public servers sharing this equivalence information. Reproducibility is key in making this possible and effective to reduce build times/costs/resource usage.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions `203`, `204`, `205` and `206` to Debian unstable, as well as made the following changes to the code itself:

Bug fixes:

Fix a `file(1)`-related regression where Debian `.changes` files that contained non-ASCII text were not identified as such, therefore resulting in seemingly arbitrary packages not actually comparing the nested files themselves. The non-ASCII parts were typically in the `Maintainer` or in the changelog text. [ ][ ]

Fix a regression when comparing directories against non-directories. [ ][ ]

If we fail to scan using `binwalk`, return `False` from `BinwalkFile.recognizes`. [ ]

If we fail to import `binwalk`, don't report that we are missing the Python `rpm` module! [ ]

Testsuite improvements:

Add a test for recent `file(1)` issue regarding `.changes` files. [ ]

Use our `assert_diff` utility where we can within the `test_directory.py` set of tests. [ ]

Don't run our `binwalk`-related tests as root or `fakeroot`. The latest version of `binwalk` has some new security protection against this. [ ]

Codebase improvements:

Drop the `_PATH` suffix from module-level globals that are not paths. [ ]

Tidy some control flow in `Difference._reverse_self`. [ ]

Don't print a warning to the console regarding `NT_GNU_BUILD_ID` changes. [ ]

In addition, Mattia Rizzolo updated the Debian packaging to ensure that `diffoscope` and `diffoscope-minimal` packages have the same version. [ ]
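
Two of the `binwalk` fixes above follow a general pattern for optional dependencies: report the module that is actually missing, and treat a failed scan as "not recognised" rather than as a fatal error. The Python sketch below illustrates that pattern only. The helpers `load_optional` and `recognizes`, and the `scan` callable passed in, are hypothetical and do not reflect diffoscope's real code or the `binwalk` API.

```python
import importlib


def load_optional(module_name):
    """Import `module_name`, returning (module, None) or (None, error message)."""
    try:
        return importlib.import_module(module_name), None
    except ImportError:
        # Name the module that is actually missing, not an unrelated one.
        return None, f"Python module {module_name!r} is not installed"


def recognizes(path, scan):
    """Return True only if `scan` runs successfully and finds something."""
    try:
        return bool(scan(path))
    except Exception:
        # A failed scan means "not recognised", not a crash of the whole run.
        return False
```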

Debian-related updates Vagrant Cascadian wrote to the `debian-devel` mailing list after noticing that the `binutils` source package contained unreproducible logs in one of its binary packages. Vagrant expanded the discussion to one about all kinds of build metadata in packages and outlined a number of potential solutions that support reproducible builds and arbitrary metadata.

Vagrant also started a discussion on `debian-devel` after identifying a large number of packages that embed build paths via RPATH when building with CMake, including a list of packages (grouped by Debian maintainer) affected by this issue. Maintainers were requested to check whether their package still builds correctly when passing the `-DCMAKE_BUILD_RPATH_USE_ORIGIN=ON` directive.

On our mailing list this month, kpcyrd announced the release of rebuilderd-debian-buildinfo-crawler, a tool that parses the `Packages.xz` Debian package index file, attempts to discover the right `.buildinfo` file from buildinfos.debian.net and outputs it in a format that can be understood by rebuilderd. The tool, which is available on GitHub, solves a problem regarding correlating Debian version numbers with their builds.

bauen1 provided two patches for debian-cd, the software used to make Debian installer images. This involved passing `--invariant` and `-i deb00001` to `mkfs.msdos(8)` and avoided embedding timestamps into the gzipped `Packages` and `Translations` files. After some discussion, the patches in question were merged and will be included in debian-cd version 3.1.36.

Roland Clobus wrote another in-depth status update about the status of live Debian images, summarising the current situation that "all major desktops build reproducibly with bullseye, bookworm and sid".

The `python3.10` package was uploaded to Debian by doko, fixing an issue where `.pyc` files were not reproducible because the elements in `frozenset` data structures were not ordered reproducibly. This meant that creating a bit-for-bit reproducible Debian chroot which included `.pyc` files was not possible. As of writing, the only remaining unreproducible part of a `standard` chroot is `man-db`, but Guillem Jover has a patch for `update-alternatives` which will likely be part of the next release of `dpkg`.

Elsewhere in Debian, 139 reviews of Debian packages were added, 29 were updated and 17 were removed this month, adding to our knowledge about identified issues. A large number of issue types have been updated too, including the addition of `captures_kernel_variant`, `erlang_escript_file`, `captures_build_path_in_r_rdb_rds_databases`, `captures_build_path_in_vo_files_generated_by_coq` and `build_path_in_vo_files_generated_by_coq`.
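
The `frozenset` issue mentioned above is an instance of a more general effect: the iteration order of a set of strings depends on Python's per-process string hash seed. The sketch below is not the `python3.10` fix itself. It uses the hypothetical helpers `order_with_seed` and `stable_repr` to run a snippet in a fresh interpreter under a chosen `PYTHONHASHSEED`, and to show that sorting before serialising removes the dependency.

```python
import os
import subprocess
import sys

SNIPPET = "print(list(frozenset(['alpha', 'beta', 'gamma', 'delta'])))"


def order_with_seed(seed, code=SNIPPET):
    """Run `code` in a fresh interpreter with a fixed string-hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def stable_repr(items):
    """Serialise a set in sorted order so the output never depends on hashing."""
    return repr(sorted(items))
```

Calling `order_with_seed` with different seeds will often (though not always) print the elements in a different order, whereas output produced via `stable_repr` is the same on every run.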

Website updates There were quite a few changes to the Reproducible Builds website and documentation this month as well, including:

Chris Lamb:

Considerably rework the Who is involved? page. [ ][ ]

Move the `contributors.sh` Bash/shell script into a Python script. [ ][ ][ ]

Daniel Shahaf:

Try a different Markdown footnote content syntax to work around a rendering issue. [ ][ ][ ]

Holger Levsen:

Make a huge number of changes to the Who is involved? page, including pre-populating a large number of contributors who cannot be identified from the metadata of the website itself. [ ][ ][ ][ ][ ]

Improve linking to sponsors in sidebar navigation. [ ]

Drop the sponsors paragraph as the navigation is clearer now. [ ]

Add Mullvad VPN as a bronze-level sponsor. [ ][ ]

Vagrant Cascadian:

Remove a stray parenthesis from the Who is involved? page. [ ]

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. February's patches included the following:

Bernhard M. Wiedemann:

`btop` (sort-related issue)

`complexity` (date)

`giac` (update the version with upstreamed date patch)

`htcondor` (use CMake timestamp)

`libint` (`readdir` system call related)

`libnet` (date-related issue)

`librime-lua` (sort filesystem ordering)

`linux_logo` (sort-related issue)

`micro-editor` (date-related issue)

`openvas-smb` (date-related issue)

`ovmf` (sort-related issue)

`paperjam` (date-related issue)

`python-PyQRCode` (date-related issue)

`quimb` (single-CPU build failure)

`radare2` (Meson date/time-related issue)

`radare2` (Rework `SOURCE_DATE_EPOCH` usage to be portable)

`siproxd` (date, with Sebastian Kemper + follow-up)

`xonsh` (Address Space Layout Randomisation-related issue)

`xsnow` (date & `tar(1)`-related issue)

`zip` (toolchain issue related to filesystem ordering)

Chris Lamb:

#1005029 filed against `ltsp` (forwarded upstream).

#1005197 filed against `pcmemtest`.

#1005825 filed against `hatchling`.

#1005826 filed against `mpl-sphinx-theme` (forwarded upstream)

#1005827 filed against `gap-hapcryst`.

#1005901 filed against `tree-puzzle`.

#1005954 filed against `jcabi-aspects`.

#1005955 filed against `paper-icon-theme`.

Roland Clobus:

#1006358 filed against `libxmlb`.

Vagrant Cascadian:

#1005408 filed against `wcwidth`.

#1005420 filed against `xir`.

#1005421 filed against `xir`.

#1005726 filed against `ruby-github-markup`.

#1005727 filed against `ruby-tioga`.

#1005792 filed against `btop`.

#1005793 filed against `libadwaita-1`.

#1005794 filed against `snibbetracker`.

#1006252 filed against `cctbx`.

#1006254 filed against `mdnsd`.

#1006256 filed against `gmerlin`.

#1006302 filed against `beav`.

#1006385 filed against `krita`.

#1006407 filed against `qt6-base`.

#1006455 filed against `onevpl-intel-gpu`.

#1006471 filed against `ruby3.0`.

#1006473 filed against `nix`.

#1006474 filed against `foma`.

#1006476 filed against `ruby3.0`.
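
Many of the patches listed above fall into two recurring categories: date-related issues and filesystem-ordering issues. As a hedged illustration of how both are commonly addressed (and not a reproduction of any particular patch), the Python sketch below builds a tar archive with a fixed member order, a fixed timestamp and neutral ownership. The function name `deterministic_tar` is an assumption made for illustration.

```python
import io
import tarfile


def deterministic_tar(files, mtime):
    """Build a tar archive whose bytes depend only on `files` and `mtime`.

    `files` maps archive member names to their contents (bytes).
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):       # fixed member order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime           # fixed timestamp
            info.uid = info.gid = 0      # no builder-specific ownership
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()
```

A build script would typically pass the value of `SOURCE_DATE_EPOCH` as `mtime`, so that the archive's timestamps reflect the source rather than the time of the build.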

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:

Daniel Golle:

Update the OpenWrt configuration to not depend on the host LLVM, adding lines to the `.config` seed to build LLVM for eBPF from source. [ ]

Preserve more OpenWrt-related build artifacts. [ ]

Holger Levsen:

Temporarily use a different Git tree when building OpenWrt as our tests had been broken since September 2020. This was reverted after the patch in question was accepted by Paul Spooren into the canonical `openwrt.git` repository the next day.

Various improvements to debugging OpenWrt reproducibility. [ ][ ][ ][ ][ ]

Ignore `useradd` warnings when building packages. [ ]

Update the script to powercycle `armhf` architecture nodes to add a hint to where nodes named `virt-`. [ ]

Update the node health check to also fix failed `logrotate` and `man-db` services. [ ]

Mattia Rizzolo:

Update the website job after the `contributors.sh` script was rewritten in Python. [ ]

Make sure to set the `DIFFOSCOPE` environment variable when available. [ ]

Vagrant Cascadian:

Various updates to the diffoscope timeouts. [ ][ ][ ]

Node maintenance was also performed by Holger Levsen [ ] and Vagrant Cascadian [ ].

Finally If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Twitter: @ReproBuilds

Mailing list: `rb-general@lists.reproducible-builds.org`

8 October 2021

Chris Lamb: Reproducible Builds: Increasing the Integrity of Software Supply Chains (2021)

I didn't blog about it at the time, but a paper I co-authored with Stefano Zacchiroli was accepted by IEEE Software in April of this year. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains, the abstract of the paper is as follows:

Although it is possible to increase confidence in Free and Open Source Software (FOSS) by reviewing its source code, trusting code is not the same as trusting its executable counterparts. These are typically built and distributed by third-party vendors with severe security consequences if their supply chains are compromised. In this paper, we present reproducible builds, an approach that can determine whether generated binaries correspond with their original source code. We first define the problem and then provide insight into the challenges of making real-world software build in a "reproducible" manner, that is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).

The full text of the paper can be found in PDF format and should appear, with an alternative layout, within a forthcoming issue of the physical IEEE Software magazine.

15 April 2021

Martin Michlmayr: ledger2beancount 2.6 released

I released version 2.6 of ledger2beancount, a ledger to beancount converter. Here are the changes in 2.6:

Round calculated total if needed for price==cost comparison
Add narration_tag config variable to set narration from metadata
Retain unconsummated payee/payer metadata
Ensure UTF-8 output and assume UTF-8 input
Document UTF-8 issue on Windows systems
Add option to move posting-level tags to the transaction itself
Add support for the alias sub-directive of account declarations
Add support for the payee sub-directive of account declarations
Support configuration file called .ledger2beancount.yaml
Fix uninitialised value warning in hledger mode
Print warning if account in assertion has sub-accounts
Set commodity for commodity-less balance assertion
Expand path name of beancount_header config variable
Document handling of buckets
Document pre- and post-processing examples
Add Dockerfile to create Docker image

Thanks to Alexander Baier, Daniele Nicolodi, and GitHub users bratekarate, faaafo and mefromthepast for various bug reports and other input. Thanks to Dennis Lee for adding a Dockerfile and to Vinod Kurup for fixing a bug. Thanks to Stefano Zacchiroli for testing. You can get ledger2beancount from GitHub.

26 March 2021

Daniel Lange: The Stallman wars

So, 2021 isn't bad enough yet, but don't despair, people are working to fix that:

Welcome to the Stallman wars

Team Cancel: https://rms-open-letter.github.io/ (repo)
Team Support: https://rms-support-letter.github.io/ (repo)

Current stats are:

Team Cancel:  3028 signers from 1413 individual commit authors
Team Support: 6249 signers from 5018 individual commit authors

Git shortlog (Top 10):

rms_cancel.git (Last update: 2021-04-07 15:42:33 (UTC))
  1228  Neil McGovern
   251  Joan Touzet
    86  Elana Hashman
    71  Molly de Blanc
    36  Shauna
    19  Juke
    18  Stefano Zacchiroli
    17  Alexey Mirages
    16  Devin Halladay
    14  Nader Jafari
rms_support.git (Last update: 2021-04-12 09:25:53 (UTC))
  1678  shenlebantongying
  1564  nukeop
  1550  Ivanq
   826  Victor
   746  Job Bautista
   123  nekonee
    61  Victor Gridnevsky
    38  Patrick Spek
    25  Borys Kabakov
    17  KIM Taeyeob

(last updated 2021-04-12 09:26:15 (UTC))

Technical info:
Signers are counted from their "Signed / Individuals" sections. Commits are counted with git shortlog -s.
Team Cancel also has organizational signatures with Mozilla, Suse and X.Org being among the notable signatories. Debian is in the process of running a GR to join (or not join) that list. The 16 original signers of the Cancel petition are added in their count. Neil McGovern, Juke and shenlebantongying need .mailmap support as they have committed with different names.

Further reading:

An introductory Ars Technica article in case you wonder what this all is about.
Debian vote mailing-list: March 2021, April 2021
NYT Magazine on the history of cancel culture
Ed Santos' commentary and analysis

31 December 2020

Chris Lamb: Free software activities in December 2020

Here is my monthly update covering what I have been doing in the free software world during December 2020 (previous month):

Reviewed and merged a contribution from Peter Law to my django-cache-toolbox library for Django-based web applications, including explicitly requiring that cached relations are primary keys (#23) and improving the example in the README (#25).

I took part in an interview with Vladimir Bejdo, an intern at the Software Freedom Conservancy, in order to talk about the Reproducible Builds project, my participation in software freedom, the importance of reproducibility in software development, and to have a brief discussion on the issues facing free software as a whole. The full interview can be found on Conservancy's webpages.

As part of my duties of being on the board of directors of the Open Source Initiative, I attended its monthly meeting and participated in various licensing and other related discussions occurring on the internet. Unfortunately, I could not attend the parallel meeting for Software in the Public Interest this month.

Reproducible Builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. This month, I:

Submitted a draft academic paper to IEEE Software. The article (co-written by Stefano Zacchiroli) is aimed at a fairly general audience. It first defines the overall problem and then provides insight into the challenges of actually making real-world software reproducible. It then outlines the experiences of the Reproducible Builds project in making large-scale software collections/supply-chains/ecosystems reproducible and concludes by describing the affinity between reproducibility efforts and quality assurance.
Kept isdebianreproducibleyet.com up to date. [...]
Submitted 11 patches in Debian to fix specific reproducibility issues in circlator, dvbstreamer, eric, jbbp, knot-resolver, libjs-qunit, mail-expire, osmo-mgw, python-pyramid, pyvows & sayonara.
Categorised a huge number of packages and issues in the Reproducible Builds 'notes' repository.
For disorderfs (our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues), I made the following changes:
- Add support for testing on Salsa's CI system. [...][...][...][...]
- Added a quick benchmark. [...]
Drafted, published and publicised our monthly report, as well as managed the project's various social media accounts.
Contributed to a discussion about the recent 'SolarWinds' attack. [...]
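
To illustrate the idea behind disorderfs mentioned above, here is a small Python sketch. It is not how disorderfs is implemented (that is a FUSE filesystem); the helpers `shuffled_listing` and `is_order_independent` are hypothetical. It presents directory entries in a deliberately arbitrary order and checks whether a build step produces the same result regardless.

```python
import random


def shuffled_listing(entries, seed=None):
    """Return directory entries in a deliberately arbitrary order."""
    entries = list(entries)
    random.Random(seed).shuffle(entries)
    return entries


def is_order_independent(build, entries, trials=20):
    """Check that `build` gives the same result however the entries are ordered."""
    expected = build(sorted(entries))
    return all(
        build(shuffled_listing(entries, seed)) == expected
        for seed in range(trials)
    )
```

A build step that sorts its inputs passes this check, whereas one that simply concatenates entries in the order given will almost certainly fail it.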

I also made a large number of changes to the main Reproducible Builds website and documentation, including applying a typo fix from Roland Clobus [...], fixing the draft detection logic (#28), adding more academic articles to our list [...] and correcting a number of grammar issues [...][...].

I also made the following changes to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including releasing version 163:

New features & bug fixes:
- Normalise ret to retq in objdump output in order to support multiple versions of GNU binutils. (#976760)
- Don't show any progress indicators when running zstd. (#226)
- Correct the grammatical tense in the --debug log output. [...]
Codebase improvements:
- Update the debian/copyright file to match the copyright notices in the source tree. (#224)
- Update various years across the codebase in .py copyright headers. [...]
- Rewrite the filter routine that post-processes the output from readelf(1). [...]
- Remove unnecessary PEP 263 encoding header lines; unnecessary after PEP 3120. [...]
- Use minimal instead of basic as a variable name to match the underlying package name. [...]
- Use pprint.pformat in the JSON comparator to serialise the differences from jsondiff. [...]
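
The `ret`/`retq` change above is an example of normalising tool output so that different tool versions compare equal. The Python sketch below shows the general approach and is not diffoscope's actual code: the function name `normalise_ret` and the assumed layout (the mnemonic appearing at the end of a disassembly line that has no trailing newline) are illustrative assumptions.

```python
import re

# Matches a bare `ret` mnemonic at the end of a line (assumed layout).
_RET = re.compile(r"\bret\s*$")


def normalise_ret(line):
    """Rewrite a trailing `ret` as `retq`, leaving every other line untouched."""
    return _RET.sub("retq", line)
```

Applying the same normalisation to both sides of a comparison means that a purely cosmetic difference between binutils versions no longer shows up as a difference between the binaries.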

Debian Uploads

python-django:
- 2.2.17-2 Fix compatibility with GNU gettext version 0.21. (#978263)
- 3.1.4-1 New upstream bugfix release.
redis:
- 6.2~rc1-1 New upstream RC release.
- 6.2~rc1-2 Enable systemd support by compiling against the libsystemd-dev package. (#977852)
- 6.2~rc1-3 Specify --supervised systemd now that we specify Type=notify to prevent service startup failure under systemd.
mtools (4.0.26-1) New upstream release.

I also sponsored an upload of adminer (4.7.8-2) on behalf of Alexandre Rossi and performed two QA uploads of sendfile (2.1b.20080616-7 and 2.1b.20080616-8) to make the build reproducible (#776938) and to fix a number of other unrelated issues.

Debian LTS This month I have worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project.

Investigated and triaged: awstats, imagemagick, node-ini, openexr, openssl1.0, p11-kit, pypy, python-py, sqlite3, sympa, etc.
Frontdesk duties, responding to user/developer questions, reviewing others' packages, participating in mailing list discussions, etc.
Issued DLA 2477-1 for the Jupyter Notebook interactive notebook system, where a maliciously-crafted link could redirect the browser to a malicious/spoofed website. (CVE-2020-26215)
Issued DLA 2491-1 and ELA-333-1 to fix two issues in OpenEXR, a set of tools to manipulate OpenEXR image files, often used in the computer-graphics industry for visual effects and animation. (CVE-2020-16588 & CVE-2020-16589)
Issued DLA 2503-1 as it was discovered that there was an issue in node-ini, an .ini configuration file format parser/serialiser for Node.js, where an application could be exploited by a malicious input file.

30 November 2020

Chris Lamb: Free software activities in November 2020

Here is my monthly update covering what I have been doing in the free software world during November 2020 (previous month):

Merged a pull request from Jens Nistler for django-slack (my library which provides a convenient wrapper between projects using Django and the Slack chat platform) to make it compatible with Celery version 5. [...]

Created a pull request for the Emscripten LLVM-to-WebAssembly compiler to make the Document Object Model codes reproducible. [...]
Added a link to my deprecated static analyser for Django to point to Richard Tier's Django Doctor. [...]

As a board member of both the Open Source Initiative and Software in the Public Interest I attended their respective monthly meetings, including the bi-annual OSI multi-day "face-to-face" meetings.

Made further progress on an academic paper in collaboration with Stefano Zacchiroli that details the theoretical and practical workings of the reproducible builds distributed consensus scheme.
Created an upstream pull request for the Emscripten LLVM-to-WebAssembly compiler to make DOM codes reproducible (with a number of followups). [...]
In Debian:
- I kept isdebianreproducibleyet.com up to date. [...]
- Submitted 11 patches to fix specific reproducibility issues in amavisd-milter, armagetronad, emscripten, less.js, metakernel, open-iscsi, os-autoinst, python-biom-format, python-pairix, requirejs & sympow.
- Sent a large number of followups to old bugs that had not been updated for some time (for example, #968700, etc.)
Categorised a large number of packages and issues in the Reproducible Builds "notes" repository, including categorising three new toolchain issues: build_path_captured_by_pyuic5, build_path_captured_by_octave & build_path_captured_by_nim.
Drafted, published and publicised last month's report as well as maintained the Project's various social media accounts.

Updated the main Reproducible Builds website and documentation to clarify that SOURCE_DATE_EPOCH is not Debian specific [...] and make a number of misc cosmetic changes [...][...].

I also made the following changes to diffoscope:

Improvements:
- Move the slightly-confusing behaviour if a single file is passed to diffoscope on the command-line to a new --load-existing-diff command. [...]
- Ensure the new diffoscope-minimal package that was introduced by Mattia Rizzolo has a different short description from the primary diffoscope one. [...]
- Refresh the long and short descriptions of all of the Debian packages. [...]
Bug fixes:
- Don't depend on radare2 in the Debian 'autopkgtests' as it will not be in bullseye due to security considerations. (#975313)
- Avoid some incorrectly-formatted error messages. This was caused by diffoscope raising an artificial CalledProcessError exception in a generic handler. [...]
Codebase improvements:
- Add a comment regarding Java tests to aid diffoscope contributors who are not using Debian [...] and don't use the old-style super(...) call [...].

Debian I performed the following uploads to the Debian Linux distribution this month:

python-django (2.2.17-1 & 3.1.3-1) New upstream releases.
memcached (1.6.9+dfsg-1) New upstream release.

lintian (2.101.0, 2.102.0, 2.103.0 & 2.104.0) New upstream releases.
xtrlock (2.14) Mark an autopkgtest as 'superficial'. (#974491)
bfs (2.1-1) New upstream release.
splint (3.1.2+dfsg-3) Re-upload a previous QA upload of mine (3.1.2+dfsg-2) to ensure the package's transition to the testing distribution. (#974872)

I also filed a release-critical bug against the minidlna package which could not be successfully purged from the system without reporting a cannot remove '/var/log/minidlna' error. (#975372)

Debian LTS This month I have worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project, including:

Investigated and triaged codemirror-js, glibc, jupyter-notebook, krb5, libhibernate3-java, raptor2, spice-vdagent & webcit.
'Frontdesk' duties, participating in mailing list discussions, attending the monthly meeting and organising LTS and ELTS frontdesk allocations for 2021.
Issued DLA 2433-1 for the Bouncy Castle cryptography library to prevent an issue where attackers could obtain sensitive information due to observable differences in its responses to invalid input. (CVE-2020-26939)
Issued DLA 2434-1 for the GNOME display manager (gdm3), where gdm3 failing to detect any users may have caused it to launch the initial system setup, permitting the creation of new users with superuser capabilities. (CVE-2020-16125)
Issued DLA 2436-1 for the sddm display manager. Here, local and unprivileged users could create a connection to the X server. (CVE-2020-28049)
Issued DLA 2437-1 & ELA-308-1 as it was discovered that there was a denial of service vulnerability in the MIT Kerberos network authentication system, krb5. The lack of a limit in an ASN.1 decoder could lead to infinite recursion and allow an attacker to overrun the stack and cause the process to crash. (CVE-2020-28196)
Issued DLA 2438-1 and ELA-309-1 to prevent two heap overflow vulnerabilities in raptor2, a set of parsers for Resource Description Framework (RDF) files used in LibreOffice and other applications. (CVE-2017-18926)
Issued DLA 2465-1 to correct filename sanitisation issues in a utility used to access PHP Pear, a distribution system for reusable PHP components. (CVE-2020-28948 & CVE-2020-28949)

You can find out more about the Debian LTS project via the following video:

3 November 2020

Martin Michlmayr: ledger2beancount 2.5 released

I released version 2.5 of ledger2beancount, a ledger to beancount converter. Here are the changes in 2.5:

Don't create negative cost for lot without cost
Support complex implicit conversions
Handle typed metadata with value 0 correctly
Set per-unit instead of total cost when cost is missing from lot
Support commodity-less amounts
Convert transactions with no amounts or only 0 amounts to notes
Fix parsing of transaction notes
Keep tags in transaction notes on same line as transaction header
Add beancount config options for non-standard root names automatically
Fix conversion of fixated prices to costs
Fix removal of price when price==cost but when they use different number formats
Fix removal of price when price==cost but per-unit and total notation mixed
Fix detection of tags and metadata after posting/aux date
Use D directive to set default commodity for hledger
Improve support for postings with commodity-less amounts
Allow empty comments
Preserve leading whitespace in comments in postings and transaction headers
Preserve indentation for tags and metadata
Preserve whitespace between amount and comment
Refactor code to use more data structures
Remove dependency on Config::Onion module

Thanks to input from Remco Rijnders, Yuri Khan, and Thierry. Thanks to Stefano Zacchiroli and Kirill Goncharov for testing my changes. You can get ledger2beancount from GitHub.

27 July 2020

Martin Michlmayr: ledger2beancount 2.4 released

I released version 2.4 of ledger2beancount, a ledger to beancount converter. There are two notable changes in this release:

I fixed two regressions introduced in the last release. Sorry about the breakage!
I improved support for hledger. I believe all syntax differences in hledger are supported now.

Here are the changes in 2.4:

Fix regressions introduced in version 2.3
- Handle price directives with comments
- Don't assume implicit conversion when price is on second posting
Improve support for hledger
- Fix parsing of hledger tags
- Support commas as decimal markers
- Support digit group marks through commodity and D directives
- Support end aliases directive
- Support regex aliases
- Recognise total balance assertions
- Recognise sub-account balance assertions
Add support for define directive
Convert all uppercase metadata tags to all lowercase
Improve handling of ledger lots without cost
Allow transactions without postings
Fix parsing issue in commodity declarations
Support commodities that contain quotation marks
Add --version option to show version
Document problem of mixing apply and include

Thanks to Kirill Goncharov for pointing out one of the regressions, to Taylor R Campbell for a patch, to Stefano Zacchiroli for some input, and finally to Simon Michael for input on hledger! You can get ledger2beancount from GitHub.

26 June 2020

Martin Michlmayr: ledger2beancount 2.3 released

I released version 2.3 of ledger2beancount, a ledger to beancount converter. There are three notable changes with this release:

Performance has significantly improved. One large, real-world test case has gone from around 160 seconds to 33 seconds. A smaller test case has gone from 11 seconds to ~3.5 seconds.
The documentation is available online now (via Read the Docs).
The repository has moved to the beancount GitHub organization.

Here are the changes in 2.3:

Improve speed of ledger2beancount significantly
Improve parsing of postings for accuracy and speed
Improve support for inline math
Handle lots without cost
Fix parsing of lot notes followed by a virtual price
Add support for lot value expressions
Make parsing of numbers more strict
Fix behaviour of dates without year
Accept default ledger date formats without configuration
Fix implicit conversions with negative prices
Convert implicit conversions in a more idiomatic way
Avoid introducing trailing whitespace with hledger input
Fix loading of config file
Skip ledger directive import
Convert documentation to mkdocs

Thanks to Colin Dean for some feedback. Thanks to Stefano Zacchiroli for prompting me into investigating performance issues (and thanks to the developers of the Devel::NYTProf profiler). You can get ledger2beancount from GitHub.

6 April 2020

Martin Michlmayr: ledger2beancount 2.1 released

I released version 2.1 of ledger2beancount, a ledger to beancount converter. Here are the changes in 2.1:

Handle postings with posting dates and comments but no amount
Show transactions with only one posting (without bucket)
Add spacing between automatic declarations
Preserve preliminary info at the top

You can get ledger2beancount from GitHub. Thanks to Thierry (thdox) for reporting a bug and for fixing some typos in the documentation. Thanks to Stefano Zacchiroli for some good feedback.

23 August 2017

Antoine Beaupré: The supposed decline of copyleft

At DebConf17, John Sullivan, the executive director of the FSF, gave a talk on the supposed decline of the use of copyleft licenses in free-software projects. In his presentation, Sullivan questioned the notion that permissive licenses, like the BSD or MIT licenses, are gaining ground at the expense of the traditionally dominant copyleft licenses from the FSF. While there does seem to be a rise in the use of permissive licenses, in general, there are several possible explanations for the phenomenon.

When the rumor mill starts

Sullivan gave a recent example of the claim of the decline of copyleft in an article on Opensource.com by Jono Bacon from February 2017 that showed a histogram of license usage between 2010 and 2017 (seen below).

From that, Bacon elaborates possible reasons for the apparent decline of the GPL. The graphic used in the article was actually generated by Stephen O'Grady in a January article, The State Of Open Source Licensing, which said:
In Black Duck's sample, the most popular variant of the GPL version 2 is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%.
Sullivan, however, argued that the methodology used to create both articles was problematic. Neither contains original research: the graphs actually come from the Black Duck Software "KnowledgeBase" data, which was partly created from the old Ohloh web site now known as Open Hub. To show one problem with the data, Sullivan mentioned two free-software projects, GNU Bash and GNU Emacs, that had been showcased on the front page of Ohloh.net in 2012. On the site, Bash was (and still is) listed as GPLv2+, whereas it changed to GPLv3 in 2011. He also claimed that "Emacs was listed as licensed under GPLv3-only, which is a license Emacs has never had in its history", although I wasn't able to verify that information from the Internet archive. Basically, according to Sullivan, "the two projects featured on the front page of a site that was using [the Black Duck] data set were wrong". This, in turn, seriously brings into question the quality of the data:
I reported this problem and we'll continue to do that but when someone is not sharing the data set that they're using for other people to evaluate it and we see glimpses of it which are incorrect, that should give us a lot of hesitation about accepting any conclusion that comes out of it.
Reproducible observations are necessary to the establishment of solid theories in science. Sullivan didn't try to contact Black Duck to get access to the database, because he assumed (rightly, as it turned out) that he would need to "pay for the data under terms that forbid you to share that information with anybody else". So I wrote Black Duck myself to confirm this information. In an email interview, Patrick Carey from Black Duck confirmed its data set is proprietary. He believes, however, that through a "combination of human and automated techniques", Black Duck is "highly confident at the accuracy and completeness of the data in the KnowledgeBase". He did point out, however, that "the way we track the data may not necessarily be optimal for answering the question on license use trend" as "that would entail examination of new open source projects coming into existence each year and the licenses used by them". In other words, even according to Black Duck, its database may not be useful to establish the conclusions drawn by those articles. Carey did agree with those conclusions intuitively, however, saying that "there seems to be a shift toward Apache and MIT licenses in new projects, though I don't have data to back that up". He suggested that "an effective way to answer the trend question would be to analyze the new projects on GitHub over the last 5-10 years." Carey also suggested that "GitHub has become so dominant over the recent years that just looking at projects on GitHub would give you a reasonable sampling from which to draw conclusions".

Indeed, GitHub published a report in 2015 that also seems to confirm MIT's popularity (45%), surpassing copyleft licenses (24%). The data is, however, not without its own limitations. For example, in the above graph going back to the inception of GitHub in 2008, we see a rather abnormal spike in 2013, which seems to correlate with the launch of the choosealicense.com site, described by GitHub as "our first pass at making open source licensing on GitHub easier". In his talk, Sullivan was critical of the initial version of the site, which he described as biased toward permissive licenses. Because the GitHub project creation page links to the site, Sullivan explained that the site's bias could have actually influenced GitHub users' license choices. Following a talk from Sullivan at FOSDEM 2016, GitHub addressed the problem later that year by rewording parts of the front page to be more accurate, but any resulting change in license choice obviously doesn't show in the report produced in 2015 and won't affect choices users have already made. Therefore, there are reasonable doubts that GitHub's subset of software projects is actually representative of the larger free-software community.

In search of solid evidence

So it seems we are missing good, reproducible results to confirm or dispel these claims. Sullivan explained that it is a difficult problem, if only in the way you select which projects to analyze: the impact of an MIT-licensed personal wiki will obviously be vastly different from, say, a GPL-licensed C compiler or kernel. We may want to distinguish between active and inactive projects. Then there is the problem of code duplication, both across publication platforms (a project may be published on GitHub and SourceForge for example) and across projects (code may be copy-pasted between projects). We should think about how to evaluate the license of a given project: different files in the same code base regularly have different licenses, and often none at all. This is why having a clear, documented and publicly available data set and methodology is critical. Without this, the assumptions made are not clear and it is unreasonable to draw certain conclusions from the results. It turns out that some researchers did that kind of open research in 2016 in a paper called "The Debsources Dataset: Two Decades of Free and Open Source Software" [PDF] by Matthieu Caneill, Daniel M. Germán, and Stefano Zacchiroli. The Debsources data set is the complete Debian source code that covers a large history of the Debian project and therefore includes thousands of free-software projects of different origins. According to the paper:
The long history of Debian creates a perfect subject to evaluate how FOSS licenses use has evolved over time, and the popularity of licenses currently in use.
Sullivan argued that the Debsources data set is interesting because of its quality: every package in Debian has been reviewed by multiple humans, including the original packager, but also by the FTP masters to ensure that the distribution can legally redistribute the software. The existence of a package in Debian provides a minimal "proof of use": unmaintained packages get removed from Debian on a regular basis and the mere fact that a piece of software gets packaged in Debian means at least some users found it important enough to work on packaging it. Debian packagers make specific efforts to avoid code duplication between packages in order to ease security maintenance. The data set covers a period longer than Black Duck's or GitHub's, as it goes all the way back to the Hamm 2.0 release in 1998. The data and how to reproduce it are freely available under a CC BY-SA 4.0 license.

Sullivan presented the above graph from the research paper that showed the evolution of software license use in the Debian archive. Whereas previous graphs showed statistics in percentages, this one showed actual absolute numbers, where we can't actually distinguish a decline in copyleft licenses. To quote the paper again:
The top license is, once again, GPL-2.0+, followed by: Artistic-1.0/GPL dual-licensing (the licensing choice of Perl and most Perl libraries), GPL-3.0+, and Apache-2.0.
Indeed, looking at the graph, at most we see a rise of the Apache and MIT licenses and no decline of the GPL per se, although its adoption does seem to slow down in recent years. We should also mention the possibility that Debian's data set has the opposite bias: toward GPL software. The Debian project is culturally quite different from the GitHub community and even the larger free-software ecosystem, naturally, which could explain the disparity in the results. We can only hope a similar analysis can be performed on the much larger Software Heritage data set eventually, which may give more representative results. The paper acknowledges this problem:
Debian is likely representative of enterprise use of FOSS as a base operating system, where stable, long-term and seldomly updated software products are desirable. Conversely Debian is unlikely representative of more dynamic FOSS environments (e.g., modern Web-development with micro libraries) where users, who are usually developers themselves, expect to receive library updates on a daily basis.
The Debsources research also shares methodology limitations with Black Duck: while Debian packages are reviewed before uploading and we can rely on the copyright information provided by Debian maintainers, the research also relies on automated tools (specifically FOSSology) to retrieve license information. Sullivan also warned against "ascribing reason to numbers": people may have different reasons for choosing a particular license. Developers may choose the MIT license because it has fewer words, for compatibility reasons, or simply because "their lawyers told them to". It may not imply an actual deliberate philosophical or ideological choice. Finally, he brought up the theory that the rise of non-copyleft licenses isn't necessarily at the detriment of the GPL. He explained that, even if there is an actual decline, it may not be much of a problem if there is an overall growth of free software to the detriment of proprietary software. He reminded the audience that non-copyleft licenses are still free software, according to the FSF and the Debian Free Software Guidelines, so their rise is still a positive outcome. Even if the GPL is a better tool to accomplish the goal of a free-software world, we can all acknowledge that the conversion of proprietary software to more permissive and certainly simpler licenses is definitely heading in the right direction.
[I would like to thank the DebConf organizers for providing meals for me during the conference.] Note: this article first appeared in the Linux Weekly News.

25 February 2017

Stefano Zacchiroli: Software Freedom Conservancy matching

become a Conservancy supporter by February 28th and have your donation matched

Non-profits that provide project support have proven themselves to be necessary for the success and advancement of individual projects and Free Software as a whole. The Free Software Foundation (founded in 1985) serves as a home to GNU projects and a canonical list of Free Software licenses. The Open Source Initiative came about in 1998, maintaining the Open Source Definition, based on the Debian Free Software Guidelines, with affiliate members including Debian, Mozilla, and the Wikimedia Foundation. Software in the Public Interest (SPI) was created in the late 90s largely to act as a fiscal sponsor for projects like Debian, enabling it to do things like accept donations and handle other financial transactions. More recently (2006), the Software Freedom Conservancy was formed. Among other activities, like serving as a fiscal sponsor, infrastructure provider, and support organization for a number of free software projects (including Git, Outreachy, and the Debian Copyright Aggregation Project), they protect user freedom via copyleft compliance and GPL enforcement work. Without a willingness to act when licenses are violated, copyleft has no power. Through communication, collaboration, and, only as a last resort, litigation, the Conservancy helps everyone who uses a freedom-respecting license. The Conservancy has been aggressively fundraising in order to not just continue its current operations, but expand its work, staff, and efforts. They recently launched a donation matching campaign thanks to the generosity and dedication of an anonymous donor. Everyone who joins the Conservancy as an annual Supporter by February 28th will have their donation matched. A number of us are already supporters, and hope you will join us in supporting the work of an organization that supports us.

12 February 2017

Stefano Zacchiroli: Opening the Software Heritage archive

... one API (and one FOSDEM) at a time

[ originally posted on the Software Heritage blog, reposted here with minor adaptations ]

Last Saturday at FOSDEM we opened up the public API of Software Heritage, allowing anyone to programmatically browse its archive. We posted this while I was keynoting with Roberto at FOSDEM 2017, to discuss the role Software Heritage plays in preserving the Free Software commons. To accompany the talk we released our first public API, which allows users to navigate the entire content of the Software Heritage archive as a graph of connected development objects (e.g., blobs, directories, commits, releases, etc.). Over the past months we have been busy working on getting source code (with full development history) into the archive, to minimize the risk that important bits of Free/Open Source Software that are publicly available today disappear forever from the net, due to whatever reason --- crashes, black hat hacking, business decisions, you name it. As a result, our archive is already one of the largest collections of source code in existence, spanning a GitHub mirror, injections of important Free Software collections such as Debian and GNU, and an ongoing import of all Google Code and Gitorious repositories. Up to now, however, the archive was deposit-only. There was no way for the public to access its content. While there is a lot of value in archival per se, our mission is to Collect, Preserve, and Share all the material we collect with everybody. Plus, we totally get that a deposit-only library is much less exciting than a store-and-retrieve one! Last Saturday we took a first important step towards providing full access to the content of our archive: we released version 1 of our public API, which allows users to navigate the Software Heritage archive programmatically. You can have a look at the API documentation for full details about how it works.
But to briefly recap: conceptually, our archive is a giant Merkle DAG connecting together all development-related objects we encounter while crawling public VCS repositories, source code releases, and GNU/Linux distribution packages. Examples of the objects we store are: file contents, directories, commits, releases; as well as their metadata, such as: log messages, author information, permission bits, etc. The API we have just released allows you to navigate this huge graph pointwise. Using the API you can look up individual objects by their IDs, retrieve their metadata, and jump from one object to another following links --- e.g., from a commit to the corresponding directory or parent commits, from a release to the annotated commit, etc. Additionally, you can retrieve crawling-related information, such as the software origins we track (usually as VCS clone/checkout URLs), and the full list of visits we have done on any known software origin. This lets you know, for instance, when we took snapshots of a Git repository you care about and, for each visit, where each branch of the repo was pointing at that time. Our resources for offering the API as a public service are still quite limited. This is the reason why you will encounter a couple of limitations. First, no download of the actual content of files we have stored is possible yet --- you can retrieve all content-related metadata (e.g., checksums, detected file types and languages, etc.), but not the actual content as a byte sequence. Second, some pretty severe rate limits apply: API access is entirely anonymous and users are identified by their IP address, so each "user" will be able to do a little bit more than 100 requests/hour. This is to keep our infrastructure sane while we grow in capacity and focus our attention on developing other archive features. If you're interested in having rate limits lifted for a specific use case or experiment, please contact us and we will see what we can do to help.
If you'd like to contribute to increase our resource pool, have a look at our sponsorship program!
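
As an illustration of the pointwise lookups and the rate limit described above, here is a minimal Python sketch, not an official client. The base URL and the `/<kind>/<id>/` path layout are assumptions to be checked against the API documentation, and `object_url` and `Throttle` are hypothetical helper names. The throttle simply spaces calls evenly so that a script stays at roughly 100 requests per hour.

```python
import time
from urllib.parse import quote

# Assumed base URL of the public API; verify it in the API documentation.
API_ROOT = "https://archive.softwareheritage.org/api/1"


def object_url(kind, object_id):
    """Build a lookup URL for one object (assumed layout: /<kind>/<id>/)."""
    return f"{API_ROOT}/{quote(kind)}/{quote(object_id)}/"


class Throttle:
    """Space requests evenly so that at most `per_hour` are sent per hour."""

    def __init__(self, per_hour=100, clock=time.monotonic, sleep=time.sleep):
        self.interval = 3600.0 / per_hour  # seconds between two requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request sent yet

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

A script would call `throttle.wait()` before each HTTP request to a URL built with `object_url(...)`, then follow the links found in the returned metadata. The clock and sleep functions are injectable only so that the pacing logic can be checked without actually waiting.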

28 January 2017

Bits from Debian: Debian at FOSDEM 2017

On February 4th and 5th, Debian will be attending FOSDEM 2017 in Brussels, Belgium; a yearly gratis event (no registration needed) run by volunteers from the Open Source and Free Software community. It's free, and it's big: more than 600 speakers, over 600 events, in 29 rooms. This year more than 45 current or past Debian contributors will speak at FOSDEM: Alexandre Viau, Bradley M. Kuhn, Daniel Pocock, Guus Sliepen, Johan Van de Wauw, John Sullivan, Josh Triplett, Julien Danjou, Keith Packard, Martin Pitt, Peter Van Eynde, Richard Hartmann, Sebastian Dröge, Stefano Zacchiroli and Wouter Verhelst, among others. Similar to previous years, the event will be hosted at Université libre de Bruxelles. Debian contributors and enthusiasts will be taking shifts at the Debian stand with gadgets, T-Shirts and swag. You can find us at stand number 4 in building K, 1 B; CoreOS Linux and PostgreSQL will be our neighbours. See https://wiki.debian.org/DebianEvents/be/2017/FOSDEM for more details. We are looking forward to meeting you all!

28 November 2016

Stefano Zacchiroli: last week to take part in the Debian Contributors Survey

Debian Contributors Survey 2016

About 3 weeks ago, together with Molly and Mathieu, we launched the first edition of the Debian Contributors Survey. I won't harp on it any further, because you can find all relevant information about it on the Debian blog or as part of the original announcement. But it's worth noting that you now have only one week left to participate if you want to: the deadline for participation is 4 December 2016, at 23:59 UTC. If you're a Debian contributor and would like to participate, just go to the survey participation page and fill it in!

16 November 2016

Bits from Debian: Debian Contributors Survey 2016

The Debian Contributor Survey launched last week! In order to better understand and document who contributes to Debian, we (Mathieu O'Neil, Molly de Blanc, and Stefano Zacchiroli) have created this survey to capture the current state of participation in the Debian Project through the lens of common demographics. We hope a general survey will become an annual effort, and that each year there will also be a focus on a specific aspect of the project or community. The 2016 edition contains sections concerning work, employment, and labour issues in order to learn about who is getting paid to work on and with Debian, and how those relationships affect contributions. We want to hear from as many Debian contributors as possible, whether you've submitted a bug report, attended a DebConf, reviewed translations, maintain packages, participated in Debian teams, or are a Debian Developer. Completing the survey should take 10-30 minutes, depending on your current involvement with the project and employment status. In an effort to reflect our own ideals as well as those of the Debian project, we are using LimeSurvey, an entirely free software survey tool, in an instance of it hosted by the LimeSurvey developers. Survey responses are anonymous, IP and HTTP information are not logged, and all questions are optional. As it is still likely possible to determine who a respondent is based on their answers, results will only be distributed in aggregate form, in a way that does not allow deanonymization. The results of the survey will be analyzed as part of ongoing research work by the organizers. A report discussing the results will be published under a DFSG-free license and distributed to the Debian community as soon as it's ready. The raw, disaggregated answers will not be distributed and will be kept under the responsibility of the organizers. We hope you will fill out the Debian Contributor Survey. The deadline for participation is: 4 December 2016, at 23:59 UTC.
If you have any questions, don't hesitate to contact us via email at:

Mathieu O'Neil mathieu.oneil@canberra.edu.au
Molly de Blanc deblanc@riseup.net
Stefano Zacchiroli zack@debian.org
