Search Results: "zack"

22 January 2024

Chris Lamb: Increasing the Integrity of Software Supply Chains awarded IEEE Best Paper award

IEEE Software recently announced that a paper that I co-authored with Dr. Stefano Zacchiroli has recently been awarded their Best Paper award:
Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains, the abstract reads as follows:
Although it is possible to increase confidence in Free and Open Source Software (FOSS) by reviewing its source code, trusting code is not the same as trusting its executable counterparts. These are typically built and distributed by third-party vendors with severe security consequences if their supply chains are compromised. In this paper, we present reproducible builds, an approach that can determine whether generated binaries correspond with their original source code. We first define the problem and then provide insight into the challenges of making real-world software build in a "reproducible" manner that is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).
According to Google Scholar, the paper has accumulated almost 40 citations since publication. The full text of the paper can be found in PDF format.

11 January 2024

Reproducible Builds: Reproducible Builds in December 2023

Welcome to the December 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries (more).

Reproducible Builds: Increasing the Integrity of Software Supply Chains awarded IEEE Software Best Paper award In February 2022, we announced in these reports that a paper written by Chris Lamb and Stefano Zacchiroli was now available in the March/April 2022 issue of IEEE Software. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF). This month, however, IEEE Software announced that this paper has won their Best Paper award for 2022.

Reproducibility to affect package migration policy in Debian In a post summarising the activities of the Debian Release Team at a recent in-person Debian event in Cambridge, UK, Paul Gevers announced a change to the way packages are migrated into the staging area for the next stable Debian release based on its reproducibility status:
The folks from the Reproducibility Project have come a long way since they started working on it 10 years ago, and we believe it s time for the next step in Debian. Several weeks ago, we enabled a migration policy in our migration software that checks for regression in reproducibility. At this moment, that is presented as just for info, but we intend to change that to delays in the not so distant future. We eventually want all packages to be reproducible. To stimulate maintainers to make their packages reproducible now, we ll soon start to apply a bounty [speedup] for reproducible builds, like we ve done with passing autopkgtests for years. We ll reduce the bounty for successful autopkgtests at that moment in time.

Speranza: Usable, privacy-friendly software signing Kelsey Merrill, Karen Sollins, Santiago Torres-Arias and Zachary Newman have developed a new system called Speranza, which is aimed at reassuring software consumers that the product they are getting has not been tampered with and is coming directly from a source they trust. A write-up on TechXplore.com goes into some more details:
What we have done, explains Sollins, is to develop, prove correct, and demonstrate the viability of an approach that allows the [software] maintainers to remain anonymous. Preserving anonymity is obviously important, given that almost everyone software developers included value their confidentiality. This new approach, Sollins adds, simultaneously allows [software] users to have confidence that the maintainers are, in fact, legitimate maintainers and, furthermore, that the code being downloaded is, in fact, the correct code of that maintainer. [ ]
The corresponding paper is published on the arXiv preprint server in various formats, and the announcement has also been covered in MIT News.

Nondeterministic Git bundles Paul Baecher published an interesting blog post on Reproducible git bundles. For those who are not familiar with them, Git bundles are used for the offline transfer of Git objects without an active server sitting on the other side of a network connection. Anyway, Paul wrote about writing a backup system for his entire system, but:
I noticed that a small but fixed subset of [Git] repositories are getting backed up despite having no changes made. That is odd because I would think that repeated bundling of the same repository state should create the exact same bundle. However [it] turns out that for some, repositories bundling is nondeterministic.
Paul goes on to to describe his solution, which involves forcing git to be single threaded makes the output deterministic . The article was also discussed on Hacker News.

Output from libxlst now deterministic libxslt is the XSLT C library developed for the GNOME project, where XSLT itself is an XML language to define transformations for XML files. This month, it was revealed that the result of the generate-id() XSLT function is now deterministic across multiple transformations, fixing many issues with reproducible builds. As the Git commit by Nick Wellnhofer describes:
Rework the generate-id() function to return deterministic values. We use
a simple incrementing counter and store ids in the 'psvi' member of
nodes which was freed up by previous commits. The presence of an id is
indicated by a new "source node" flag.
This fixes long-standing problems with reproducible builds, see
https://bugzilla.gnome.org/show_bug.cgi?id=751621
This also hardens security, as the old implementation leaked the
difference between a heap and a global pointer, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1356211
The old implementation could also generate the same id for dynamically
created nodes which happened to reuse the same memory. Ids for namespace
nodes were completely broken. They now use the id of the parent element
together with the hex-encoded namespace prefix.

Community updates There were made a number of improvements to our website, including Chris Lamb fixing the generate-draft script to not blow up if the input files have been corrupted today or even in the past [ ], Holger Levsen updated the Hamburg 2023 summit to add a link to farewell post [ ] & to add a picture of a Post-It note. [ ], and Pol Dellaiera updated the paragraph about tar and the --clamp-mtime flag [ ]. On our mailing list this month, Bernhard M. Wiedemann posted an interesting summary on some of the reasons why packages are still not reproducible in 2023. diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including processing objdump symbol comment filter inputs as Python byte (and not str) instances [ ] and Vagrant Cascadian extended diffoscope support for GNU Guix [ ] and updated the version in that distribution to version 253 [ ].

Challenges of Producing Software Bill Of Materials for Java Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, C sar Soto-Valero and Martin Wittlinger (!) of the KTH Royal Institute of Technology in Sweden, have published an article in which they:
deep-dive into 6 tools and the accuracy of the SBOMs they produce for complex open-source Java projects. Our novel insights reveal some hard challenges regarding the accurate production and usage of software bills of materials.
The paper is available on arXiv.

Debian Non-Maintainer campaign As mentioned in previous reports, the Reproducible Builds team within Debian has been organising a series of online and offline sprints in order to clear the huge backlog of reproducible builds patches submitted by performing so-called NMUs (Non-Maintainer Uploads). During December, Vagrant Cascadian performed a number of such uploads, including: In addition, Holger Levsen performed three no-source-change NMUs in order to address the last packages without .buildinfo files in Debian trixie, specifically lorene (0.0.0~cvs20161116+dfsg-1.1), maria (1.3.5-4.2) and ruby-rinku (1.7.3-2.1).

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In December, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Fix matching packages for the [R programming language](https://en.wikipedia.org/wiki/R_(programming_language). [ ][ ][ ]
    • Add a Certbot configuration for the Nginx web server. [ ]
    • Enable debugging for the create-meta-pkgs tool. [ ][ ]
  • Arch Linux-related changes
    • The asp has been deprecated by pkgctl; thanks to dvzrv for the pointer. [ ]
    • Disable the Arch Linux builders for now. [ ]
    • Stop referring to the /trunk branch / subdirectory. [ ]
    • Use --protocol https when cloning repositories using the pkgctl tool. [ ]
  • Misc changes:
    • Install the python3-setuptools and swig packages, which are now needed to build OpenWrt. [ ]
    • Install pkg-config needed to build Coreboot artifacts. [ ]
    • Detect failures due to an issue where the fakeroot tool is implicitly required but not automatically installed. [ ]
    • Detect failures due to rename of the vmlinuz file. [ ]
    • Improve the grammar of an error message. [ ]
    • Document that freebsd-jenkins.debian.net has been updated to FreeBSD 14.0. [ ]
In addition, node maintenance was performed by Holger Levsen [ ] and Vagrant Cascadian [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

13 July 2022

Reproducible Builds: Reproducible Builds in June 2022

Welcome to the June 2022 report from the Reproducible Builds project. In these reports, we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.

Save the date! Despite several delays, we are pleased to announce dates for our in-person summit this year: November 1st 2022 November 3rd 2022
The event will happen in/around Venice (Italy), and we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement mail from Mattia Rizzolo, and do keep an eye on the mailing list for further announcements as it will hopefully include registration instructions.

News David Wheeler filed an issue against the Rust programming language to report that builds are not reproducible because full path to the source code is in the panic and debug strings . Luckily, as one of the responses mentions: the --remap-path-prefix solves this problem and has been used to great effect in build systems that rely on reproducibility (Bazel, Nix) to work at all and that there are efforts to teach cargo about it here .
The Python Security team announced that:
The ctx hosted project on PyPI was taken over via user account compromise and replaced with a malicious project which contained runtime code which collected the content of os.environ.items() when instantiating Ctx objects. The captured environment variables were sent as a base64 encoded query parameter to a Heroku application [ ]
As their announcement later goes onto state, version-pinning using hash-checking mode can prevent this attack, although this does depend on specific installations using this mode, rather than a prevention that can be applied systematically.
Developer vanitasvitae published an interesting and entertaining blog post detailing the blow-by-blow steps of debugging a reproducibility issue in PGPainless, a library which aims to make using OpenPGP in Java projects as simple as possible . Whilst their in-depth research into the internals of the .jar may have been unnecessary given that diffoscope would have identified the, it must be said that there is something to be said with occasionally delving into seemingly low-level details, as well describing any debugging process. Indeed, as vanitasvitae writes:
Yes, this would have spared me from 3h of debugging But I probably would also not have gone onto this little dive into the JAR/ZIP format, so in the end I m not mad.

Kees Cook published a short and practical blog post detailing how he uses reproducibility properties to aid work to replace one-element arrays in the Linux kernel. Kees approach is based on the principle that if a (small) proposed change is considered equivalent by the compiler, then the generated output will be identical but only if no other arbitrary or unrelated changes are introduced. Kees mentions the fantastic diffoscope tool, as well as various kernel-specific build options (eg. KBUILD_BUILD_TIMESTAMP) in order to prepare my build with the known to disrupt code layout options disabled .
Stefano Zacchiroli gave a presentation at GDR S curit Informatique based in part on a paper co-written with Chris Lamb titled Increasing the Integrity of Software Supply Chains. (Tweet)

Debian In Debian in this month, 28 reviews of Debian packages were added, 35 were updated and 27 were removed this month adding to our knowledge about identified issues. Two issue types were added: nondeterministic_checksum_generated_by_coq and nondetermistic_js_output_from_webpack. After Holger Levsen found hundreds of packages in the bookworm distribution that lack .buildinfo files, he uploaded 404 source packages to the archive (with no meaningful source changes). Currently bookworm now shows only 8 packages without .buildinfo files, and those 8 are fixed in unstable and should migrate shortly. By contrast, Debian unstable will always have packages without .buildinfo files, as this is how they come through the NEW queue. However, as these packages were not built on the official build servers (ie. they were uploaded by the maintainer) they will never migrate to Debian testing. In the future, therefore, testing should never have packages without .buildinfo files again. Roland Clobus posted yet another in-depth status report about his progress making the Debian Live images build reproducibly to our mailing list. In this update, Roland mentions that all major desktops build reproducibly with bullseye, bookworm and sid but also goes on to outline the progress made with automated testing of the generated images using openQA.

GNU Guix Vagrant Cascadian made a significant number of contributions to GNU Guix: Elsewhere in GNU Guix, Ludovic Court s published a paper in the journal The Art, Science, and Engineering of Programming called Building a Secure Software Supply Chain with GNU Guix:
This paper focuses on one research question: how can [Guix]((https://www.gnu.org/software/guix/) and similar systems allow users to securely update their software? [ ] Our main contribution is a model and tool to authenticate new Git revisions. We further show how, building on Git semantics, we build protections against downgrade attacks and related threats. We explain implementation choices. This work has been deployed in production two years ago, giving us insight on its actual use at scale every day. The Git checkout authentication at its core is applicable beyond the specific use case of Guix, and we think it could benefit to developer teams that use Git.
A full PDF of the text is available.

openSUSE In the world of openSUSE, SUSE announced at SUSECon that they are preparing to meet SLSA level 4. (SLSA (Supply chain Levels for Software Artifacts) is a new industry-led standardisation effort that aims to protect the integrity of the software supply chain.) However, at the time of writing, timestamps within RPM archives are not normalised, so bit-for-bit identical reproducible builds are not possible. Some in-toto provenance files published for SUSE s SLE-15-SP4 as one result of the SLSA level 4 effort. Old binaries are not rebuilt, so only new builds (e.g. maintenance updates) have this metadata added. Lastly, Bernhard M. Wiedemann posted his usual monthly openSUSE reproducible builds status report.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 215, 216 and 217 to Debian unstable. Chris Lamb also made the following changes:
  • New features:
    • Print profile output if we were called with --profile and we were killed via a TERM signal. This should help in situations where diffoscope is terminated due to some sort of timeout. [ ]
    • Support both PyPDF 1.x and 2.x. [ ]
  • Bug fixes:
    • Also catch IndexError exceptions (in addition to ValueError) when parsing .pyc files. (#1012258)
    • Correct the logic for supporting different versions of the argcomplete module. [ ]
  • Output improvements:
    • Don t leak the (likely-temporary) pathname when comparing PDF documents. [ ]
  • Logging improvements:
    • Update test fixtures for GNU readelf 2.38 (now in Debian unstable). [ ][ ]
    • Be more specific about the minimum required version of readelf (ie. binutils), as it appears that this patch level version change resulted in a change of output, not the minor version. [ ]
    • Use our @skip_unless_tool_is_at_least decorator (NB. at_least) over @skip_if_tool_version_is (NB. is) to fix tests under Debian stable. [ ]
    • Emit a warning if/when we are handling a UNIX TERM signal. [ ]
  • Codebase improvements:
    • Clarify in what situations the main finally block gets called with respect to TERM signal handling. [ ]
    • Clarify control flow in the diffoscope.profiling module. [ ]
    • Correctly package the scripts/ directory. [ ]
In addition, Edward Betts updated a broken link to the RSS on the diffoscope homepage and Vagrant Cascadian updated the diffoscope package in GNU Guix [ ][ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Add a package set for packages that use the R programming language [ ] as well as one for Rust [ ].
    • Improve package set matching for Python [ ] and font-related [ ] packages.
    • Install the lz4, lzop and xz-utils packages on all nodes in order to detect running kernels. [ ]
    • Improve the cleanup mechanisms when testing the reproducibility of Debian Live images. [ ][ ]
    • In the automated node health checks, deprioritise the generic kernel warning . [ ]
  • Roland Clobus (Debian Live image reproducibility):
    • Add various maintenance jobs to the Jenkins view. [ ]
    • Cleanup old workspaces after 24 hours. [ ]
    • Cleanup temporary workspace and resulting directories. [ ]
    • Implement a number of fixes and improvements around publishing files. [ ][ ][ ]
    • Don t attempt to preserve the file timestamps when copying artifacts. [ ]
And finally, node maintenance was also performed by Mattia Rizzolo [ ].

Mailing list and website On our mailing list this month: Lastly, Chris Lamb updated the main Reproducible Builds website and documentation in a number of small ways, but primarily published an interview with Hans-Christoph Steiner of the F-Droid project. Chris Lamb also added a Coffeescript example for parsing and using the SOURCE_DATE_EPOCH environment variable [ ]. In addition, Sebastian Crane very-helpfully updated the screenshot of salsa.debian.org s request access button on the How to join the Salsa group. [ ]

Contact If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 March 2022

Reproducible Builds: Reproducible Builds in February 2022

Welcome to the February 2022 report from the Reproducible Builds project. In these reports, we try to round-up the important things we and others have been up to over the past month. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.
Jiawen Xiong, Yong Shi, Boyuan Chen, Filipe R. Cogo and Zhen Ming Jiang have published a new paper titled Towards Build Verifiability for Java-based Systems (PDF). The abstract of the paper contains the following:
Various efforts towards build verifiability have been made to C/C++-based systems, yet the techniques for Java-based systems are not systematic and are often specific to a particular build tool (eg. Maven). In this study, we present a systematic approach towards build verifiability on Java-based systems.

GitBOM is a flexible scheme to track the source code used to generate build artifacts via Git-like unique identifiers. Although the project has been active for a while, the community around GitBOM has now started running weekly community meetings.
The paper Chris Lamb and Stefano Zacchiroli is now available in the March/April 2022 issue of IEEE Software. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), the abstract of the paper contains the following:
We first define the problem, and then provide insight into the challenges of making real-world software build in a reproducible manner-this is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).

In openSUSE, Bernhard M. Wiedemann posted his monthly reproducible builds status report.
On our mailing list this month, Thomas Schmitt started a thread around the SOURCE_DATE_EPOCH specification related to formats that cannot help embedding potentially timezone-specific timestamp. (Full thread index.)
The Yocto Project is pleased to report that it s core metadata (OpenEmbedded-Core) is now reproducible for all recipes (100% coverage) after issues with newer languages such as Golang were resolved. This was announced in their recent Year in Review publication. It is of particular interest for security updates so that systems can have specific components updated but reducing the risk of other unintended changes and making the sections of the system changing very clear for audit. The project is now also making heavy use of equivalence of build output to determine whether further items in builds need to be rebuilt or whether cached previously built items can be used. As mentioned in the article above, there are now public servers sharing this equivalence information. Reproducibility is key in making this possible and effective to reduce build times/costs/resource usage.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 203, 204, 205 and 206 to Debian unstable, as well as made the following changes to the code itself:
  • Bug fixes:
    • Fix a file(1)-related regression where Debian .changes files that contained non-ASCII text were not identified as such, therefore resulting in seemingly arbitrary packages not actually comparing the nested files themselves. The non-ASCII parts were typically in the Maintainer or in the changelog text. [ ][ ]
    • Fix a regression when comparing directories against non-directories. [ ][ ]
    • If we fail to scan using binwalk, return False from BinwalkFile.recognizes. [ ]
    • If we fail to import binwalk, don t report that we are missing the Python rpm module! [ ]
  • Testsuite improvements:
    • Add a test for recent file(1) issue regarding .changes files. [ ]
    • Use our assert_diff utility where we can within the test_directory.py set of tests. [ ]
    • Don t run our binwalk-related tests as root or fakeroot. The latest version of binwalk has some new security protection against this. [ ]
  • Codebase improvements:
    • Drop the _PATH suffix from module-level globals that are not paths. [ ]
    • Tidy some control flow in Difference._reverse_self. [ ]
    • Don t print a warning to the console regarding NT_GNU_BUILD_ID changes. [ ]
In addition, Mattia Rizzolo updated the Debian packaging to ensure that diffoscope and diffoscope-minimal packages have the same version. [ ]

Website updates There were quite a few changes to the Reproducible Builds website and documentation this month as well, including:
  • Chris Lamb:
    • Considerably rework the Who is involved? page. [ ][ ]
    • Move the contributors.sh Bash/shell script into a Python script. [ ][ ][ ]
  • Daniel Shahaf:
    • Try a different Markdown footnote content syntax to work around a rendering issue. [ ][ ][ ]
  • Holger Levsen:
    • Make a huge number of changes to the Who is involved? page, including pre-populating a large number of contributors who cannot be identified from the metadata of the website itself. [ ][ ][ ][ ][ ]
    • Improve linking to sponsors in sidebar navigation. [ ]
    • drop sponsors paragraph as the navigation is clearer now. [ ]
    • Add Mullvad VPN as a bronze-level sponsor . [ ][ ]
  • Vagrant Cascadian:

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. February s patches included the following:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Daniel Golle:
    • Update the OpenWrt configuration to not depend on the host LLVM, adding lines to the .config seed to build LLVM for eBPF from source. [ ]
    • Preserve more OpenWrt-related build artifacts. [ ]
  • Holger Levsen:
  • Temporary use a different Git tree when building OpenWrt as our tests had been broken since September 2020. This was reverted after the patch in question was accepted by Paul Spooren into the canonical openwrt.git repository the next day.
    • Various improvements to debugging OpenWrt reproducibility. [ ][ ][ ][ ][ ]
    • Ignore useradd warnings when building packages. [ ]
    • Update the script to powercycle armhf architecture nodes to add a hint to where nodes named virt-*. [ ]
    • Update the node health check to also fix failed logrotate and man-db services. [ ]
  • Mattia Rizzolo:
    • Update the website job after contributors.sh script was rewritten in Python. [ ]
    • Make sure to set the DIFFOSCOPE environment variable when available. [ ]
  • Vagrant Cascadian:
    • Various updates to the diffoscope timeouts. [ ][ ][ ]
Node maintenance was also performed by Holger Levsen [ ] and Vagrant Cascadian [ ].

Finally If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 October 2021

Chris Lamb: Reproducible Builds: Increasing the Integrity of Software Supply Chains (2021)

I didn't blog about it at the time, but a paper I co-authored with Stefano Zacchiroli was accepted by IEEE Software in April of this year. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains, the abstract of the paper is as follows:
Although it is possible to increase confidence in Free and Open Source Software (FOSS) by reviewing its source code, trusting code is not the same as trusting its executable counterparts. These are typically built and distributed by third-party vendors with severe security consequences if their supply chains are compromised. In this paper, we present reproducible builds, an approach that can determine whether generated binaries correspond with their original source code. We first define the problem and then provide insight into the challenges of making real-world software build in a "reproducible" manner that is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).
The full text of the paper can be found in PDF format and should appear, with an alternative layout, within a forthcoming issue of the physical IEEE Software magazine.

25 August 2021

Jelmer Vernooij: Thousands of Debian packages updated from their upstream Git repository

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. Linux distributions like Debian fulfill an important function in the FOSS ecosystem - they are system integrators that take existing free and open source software projects and adapt them where necessary to work well together. They also make it possible for users to install more software in an easy and consistent way and with some degree of quality control and review. One of the consequences of this model is that the distribution package often lags behind upstream releases. This is especially true for distributions that have tighter integration and standardization (such as Debian), and often new upstream code is only imported irregularly because it is a manual process - both updating the package, but also making sure that it still works together well with the rest of the system. The process of importing a new upstream used to be (well, back when I started working on Debian packages) fairly manual and something like this:

Ecosystem Improvements However, there have been developments over the last decade that make it easier to import new upstream releases into Debian packages.
Uscan and debian QA watch Uscan and debian/watch have been around for a while and make it possible to find upstream tarballs. A debian watch file usually looks something like this:
1
2
version=4
http://somesite.com/dir/filenamewithversion.tar.gz
The QA watch service regularly polls all watch locations in the archive and makes the information available, so it s possible to know which packages have changed without downloading each one of them.
Git Git is fairly ubiquitous nowadays, and most upstream projects and packages in Debian use it. There are still exceptions that do not use any version control system or that use a different control system, but they are becoming increasingly rare. [1]
debian/upstream/metadata DEP-12 specifies a file format with metadata about the upstream project that a package was based on. In particular relevant for our case is the fact it has fields for the location of the upstream version control location. debian/upstream/metadata files look something like this:
1
2
3
---
Repository: https://www.dulwich.io/code/dulwich/
Repository-Browse: https://www.dulwich.io/code/dulwich/
While DEP-12 is still a draft, it has already been widely adopted - there are about 10000 packages in Debian that ship a debian/upstream/metadata file with Repository information.
Autopkgtest The Autopkgtest standard and associated tooling provide a way to run a defined set of tests against an installed package. This makes it possible to verify that a package is working correctly as part of the system as a whole. ci.debian.net regularly runs these tests against Debian packages to detect regressions.
Vcs-Git headers The Vcs-Git headers in debian/control are the equivalent of the Repository field in debian/upstream/metadata, but for the packaging repositories (as opposed to the upstream ones). They ve been around for a while and are widely adopted, as can be seen from zack s stats: The vcswatch service that regularly polls packaging repositories to see whether they have changed makes it a lot easier to consume this information in usable way.
Debhelper adoption Over the last couple of years, Debian has slowly been converging on a single build tool - debhelper s dh interface. Being able to rely on a single build tool makes it easier to write code to update packaging when upstream changes require it.
Debhelper DWIM Debhelper (and its helpers) increasingly can figure out how to do the Right Thing in many cases without being explicitly configured. This makes packaging less effort, but also means that it s less likely that importing a new upstream version will require updates to the packaging. With all of these improvements in place, it actually becomes feasible in a lot of situations to update a Debian package to a new upstream version automatically. Of course, this requires that all of this information is available, so it won t work for all packages. In some cases, the packaging for the older upstream version might not apply to the newer upstream version. The Janitor has attempted to import a new upstream Git snapshot and a new upstream release for every package in the archive where a debian/watch file or debian/upstream/metadata file are present. These are the steps it uses:
  • Find new upstream version
    • If release, use debian/watch - or maybe tagged in upstream repository
    • If snapshot, use debian/upstream/metadata s Repository field
    • If neither is available, use guess-upstream-metadata from upstream-ontologist to guess the upstream Repository
  • Merge upstream version into packaging repository, possibly importing tarballs using pristine-tar
  • Update the changelog file to mention the new upstream version
  • Run some checks to ensure there are no unintentional changes, e.g.:
    • Scan diff between old and new for surprising license changes
      • Today, abort if there are any - in the future, maybe update debian/copyright
    • Check for obvious compatibility breaks - e.g. sonames changing
  • Attempt to update the packaging to reflect upstream changes
    • Refresh patches
  • Attempt to build the package with deb-fix-build, to deal with any missing dependencies
  • Run the autopkgtests with deb-fix-build to deal with missing dependencies, and abort if any tests fail
Results When run over all packages in unstable (sid), this process works for a surprising number of them.
Fresh Releases For fresh-releases (aka imports of upstream releases), processing all packages maintained in Git for which QA watch reports new releases (about 11,000): That means about 2300 packages updated, and about 4000 unchanged.
Fresh Snapshots For fresh-snapshots (aka imports of latest Git commit from upstream), processing all packages maintained in Git (about 26,000): Or 5100 packages updated and 2100 for which there was nothing to do, i.e. no upstream commits since the last Debian upload. As can be seen, this works for a surprising fraction of packages. It s possible to get the numbers up even higher, by both improving the tooling, the autopkgtests and the metadata that is provided by packages.
Using these packages All the packages that have been built can be accessed from the Janitor APT repository. More information can be found at https://janitor.debian.net/fresh, but in short - run:
1
2
3
4
5
6
echo deb "[arch=amd64 signed-by=/usr/share/keyrings/debian-janitor-archive-keyring.gpg]" \
    https://janitor.debian.net/ fresh-snapshots main   sudo tee /etc/apt/sources.list.d/fresh-snapshots.list
echo deb "[arch=amd64 signed-by=/usr/share/keyrings/debian-janitor-archive-keyring.gpg]" \
    https://janitor.debian.net/ fresh-releases main   sudo tee /etc/apt/sources.list.d/fresh-releases.list
sudo curl -o /usr/share/keyrings/debian-janitor-archive-keyring.gpg https://janitor.debian.net/pgp_keys
apt update
And then you can install packages from the fresh-snapshots (upstream git snapshots) or fresh-releases suites on a case-by-case basis by running something like:
1
apt install -t fresh-snapshots r-cran-roxygen2
Most packages are updated based on information provided by vcswatch and qa watch, but it s also possible for upstream repositories to call a web hook to trigger a refresh of a package. These packages were built against unstable, but should in almost all cases also work for testing.
Caveats Of course, since these packages are built automatically without human supervision it s likely that some of them will have bugs in them that would otherwise have been caught by the maintainer.
[1]I m not saying that a monoculture is great here, but it does help distributions.

31 December 2020

Chris Lamb: Free software activities in December 2020

Here is my monthly update covering what I have been doing in the free software world during December 2020 (previous month):

Reproducible Builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. This month, I: I also made the following changes to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including releasing version 163:

Debian Uploads I also sponsored an upload of adminer (4.7.8-2) on behalf of Alexandre Rossi and performed two QA uploads of sendfile (2.1b.20080616-7 and 2.1b.20080616-8) to make the build the build reproducible (#776938) and to fix a number of other unrelated issues. Debian LTS This month I have worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project. You can find out more about the Debian LTS project via the following video:

30 November 2020

Chris Lamb: Free software activities in November 2020

Here is my monthly update covering what I have been doing in the free software world during November 2020 (previous month):

Reproducible Builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. The project is proud to be a member project of the Software Freedom Conservancy. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.
This month, I:
I also made the following changes to diffoscope:

Debian I performed the following uploads to the Debian Linux distribution this month: I also filed a release-critical bug against the minidlna package which could not be successfully purged from the system without reporting a cannot remove '/var/log/minidlna' error. (#975372)

Debian LTS This month I have worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project, including: You can find out more about the Debian LTS project via the following video:

24 October 2020

Jelmer Vernooij: Debian Janitor: Hosters used by Debian packages

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. The Janitor knows how to talk to different hosting platforms. For each hosting platform, it needs to support the platform- specific API for creating and managing merge proposals. For each hoster it also needs to have credentials. At the moment, it supports the GitHub API, Launchpad API and GitLab API. Both GitHub and Launchpad have only a single instance; the GitLab instances it supports are gitlab.com and salsa.debian.org. This provides coverage for the vast majority of Debian packages that can be accessed using Git. More than 75% of all packages are available on salsa - although in some cases, the Vcs-Git header has not yet been updated. Of the other 25%, the majority either does not declare where it is hosted using a Vcs-* header (10.5%), or have not yet migrated from alioth to another hosting platform (9.7%). A further 2.3% are hosted somewhere on GitHub (2%), Launchpad (0.18%) or GitLab.com (0.15%), in many cases in the same repository as the upstream code. The remaining 1.6% are hosted on many other hosts, primarily people s personal servers (which usually don t have an API for creating pull requests). Packages per hoster

Outdated Vcs-* headers It is possible that the 20% of packages that do not have a Vcs-* header or have a Vcs header that say there on alioth are actually hosted elsewhere. However, it is hard to know where they are until a version with an updated Vcs-Git header is uploaded. The Janitor primarily relies on vcswatch to find the correct locations of repositories. vcswatch looks at Vcs-* headers but has its own heuristics as well. For about 2,000 packages (6%) that still have Vcs-* headers that point to alioth, vcswatch successfully finds their new home on salsa.
Merge Proposals by Hoster These proportions are also visible in the number of pull requests created by the Janitor on various hosters. The vast majority so far has been created on Salsa.
Hoster Open Merged & Applied Closed
github.com921685
gitlab.com1230
code.launchpad.net24511
salsa.debian.org1,3605,657126
Merge Proposal statistics In this graph, Open means that the pull request has been created but likely nobody has looked at it yet. Merged means that the pull request has been marked as merged on the hoster, and applied means that the changes have ended up in the packaging branch but via a different route (e.g. cherry-picked or manually applied). Closed means that the pull request was closed without the changes being incorporated. Note that this excludes ~5,600 direct pushes, all of which were to salsa-hosted repositories. See also:

For more information about the Janitor's lintian-fixes efforts, see the landing page.

23 August 2017

Antoine Beaupr : The supposed decline of copyleft

At DebConf17, John Sullivan, the executive director of the FSF, gave a talk on the supposed decline of the use of copyleft licenses use free-software projects. In his presentation, Sullivan questioned the notion that permissive licenses, like the BSD or MIT licenses, are gaining ground at the expense of the traditionally dominant copyleft licenses from the FSF. While there does seem to be a rise in the use of permissive licenses, in general, there are several possible explanations for the phenomenon.

When the rumor mill starts Sullivan gave a recent example of the claim of the decline of copyleft in an article on Opensource.com by Jono Bacon from February 2017 that showed a histogram of license usage between 2010 and 2017 (seen below).
[Black Duck   histogram]
From that, Bacon elaborates possible reasons for the apparent decline of the GPL. The graphic used in the article was actually generated by Stephen O'Grady in a January article, The State Of Open Source Licensing, which said:
In Black Duck's sample, the most popular variant of the GPL version 2 is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%.
Sullivan, however, argued that the methodology used to create both articles was problematic. Neither contains original research: the graphs actually come from the Black Duck Software "KnowledgeBase" data, which was partly created from the old Ohloh web site now known as Open Hub. To show one problem with the data, Sullivan mentioned two free-software projects, GNU Bash and GNU Emacs, that had been showcased on the front page of Ohloh.net in 2012. On the site, Bash was (and still is) listed as GPLv2+, whereas it changed to GPLv3 in 2011. He also claimed that "Emacs was listed as licensed under GPLv3-only, which is a license Emacs has never had in its history", although I wasn't able to verify that information from the Internet archive. Basically, according to Sullivan, "the two projects featured on the front page of a site that was using [the Black Duck] data set were wrong". This, in turn, seriously brings into question the quality of the data:
I reported this problem and we'll continue to do that but when someone is not sharing the data set that they're using for other people to evaluate it and we see glimpses of it which are incorrect, that should give us a lot of hesitation about accepting any conclusion that comes out of it.
Reproducible observations are necessary to the establishment of solid theories in science. Sullivan didn't try to contact Black Duck to get access to the database, because he assumed (rightly, as it turned out) that he would need to "pay for the data under terms that forbid you to share that information with anybody else". So I wrote Black Duck myself to confirm this information. In an email interview, Patrick Carey from Black Duck confirmed its data set is proprietary. He believes, however, that through a "combination of human and automated techniques", Black Duck is "highly confident at the accuracy and completeness of the data in the KnowledgeBase". He did point out, however, that "the way we track the data may not necessarily be optimal for answering the question on license use trend" as "that would entail examination of new open source projects coming into existence each year and the licenses used by them". In other words, even according to Black Duck, its database may not be useful to establish the conclusions drawn by those articles. Carey did agree with those conclusions intuitively, however, saying that "there seems to be a shift toward Apache and MIT licenses in new projects, though I don't have data to back that up". He suggested that "an effective way to answer the trend question would be to analyze the new projects on GitHub over the last 5-10 years." Carey also suggested that "GitHub has become so dominant over the recent years that just looking at projects on GitHub would give you a reasonable sampling from which to draw conclusions".
[GitHub   graph]
Indeed, GitHub published a report in 2015 that also seems to confirm MIT's popularity (45%), surpassing copyleft licenses (24%). The data is, however, not without its own limitations. For example, in the above graph going back to the inception of GitHub in 2008, we see a rather abnormal spike in 2013, which seems to correlate with the launch of the choosealicense.com site, described by GitHub as "our first pass at making open source licensing on GitHub easier". In his talk, Sullivan was critical of the initial version of the site which he described as biased toward permissive licenses. Because the GitHub project creation page links to the site, Sullivan explained that the site's bias could have actually influenced GitHub users' license choices. Following a talk from Sullivan at FOSDEM 2016, GitHub addressed the problem later that year by rewording parts of the front page to be more accurate, but that any change in license choice obviously doesn't show in the report produced in 2015 and won't affect choices users have already made. Therefore, there can be reasonable doubts that GitHub's subset of software projects may not actually be that representative of the larger free-software community.

In search of solid evidence So it seems we are missing good, reproducible results to confirm or dispel these claims. Sullivan explained that it is a difficult problem, if only in the way you select which projects to analyze: the impact of a MIT-licensed personal wiki will obviously be vastly different from, say, a GPL-licensed C compiler or kernel. We may want to distinguish between active and inactive projects. Then there is the problem of code duplication, both across publication platforms (a project may be published on GitHub and SourceForge for example) but also across projects (code may be copy-pasted between projects). We should think about how to evaluate the license of a given project: different files in the same code base regularly have different licenses often none at all. This is why having a clear, documented and publicly available data set and methodology is critical. Without this, the assumptions made are not clear and it is unreasonable to draw certain conclusions from the results. It turns out that some researchers did that kind of open research in 2016 in a paper called "The Debsources Dataset: Two Decades of Free and Open Source Software" [PDF] by Matthieu Caneill, Daniel M. Germ n, and Stefano Zacchiroli. The Debsources data set is the complete Debian source code that covers a large history of the Debian project and therefore includes thousands of free-software projects of different origins. According to the paper:
The long history of Debian creates a perfect subject to evaluate how FOSS licenses use has evolved over time, and the popularity of licenses currently in use.
Sullivan argued that the Debsources data set is interesting because of its quality: every package in Debian has been reviewed by multiple humans, including the original packager, but also by the FTP masters to ensure that the distribution can legally redistribute the software. The existence of a package in Debian provides a minimal "proof of use": unmaintained packages get removed from Debian on a regular basis and the mere fact that a piece of software gets packaged in Debian means at least some users found it important enough to work on packaging it. Debian packagers make specific efforts to avoid code duplication between packages in order to ease security maintenance. The data set covers a period longer than Black Duck's or GitHub's, as it goes all the way back to the Hamm 2.0 release in 1998. The data and how to reproduce it are freely available under a CC BY-SA 4.0 license.
[Debsource   graph]
Sullivan presented the above graph from the research paper that showed the evolution of software license use in the Debian archive. Whereas previous graphs showed statistics in percentages, this one showed actual absolute numbers, where we can't actually distinguish a decline in copyleft licenses. To quote the paper again:
The top license is, once again, GPL-2.0+, followed by: Artistic-1.0/GPL dual-licensing (the licensing choice of Perl and most Perl libraries), GPL-3.0+, and Apache-2.0.
Indeed, looking at the graph, at most do we see a rise of the Apache and MIT licenses and no decline of the GPL per se, although its adoption does seem to slow down in recent years. We should also mention the possibility that Debian's data set has the opposite bias: toward GPL software. The Debian project is culturally quite different from the GitHub community and even the larger free-software ecosystem, naturally, which could explain the disparity in the results. We can only hope a similar analysis can be performed on the much larger Software Heritage data set eventually, which may give more representative results. The paper acknowledges this problem:
Debian is likely representative of enterprise use of FOSS as a base operating system, where stable, long-term and seldomly updated software products are desirable. Conversely Debian is unlikely representative of more dynamic FOSS environments (e.g., modern Web-development with micro libraries) where users, who are usually developers themselves, expect to receive library updates on a daily basis.
The Debsources research also shares methodology limitations with Black Duck: while Debian packages are reviewed before uploading and we can rely on the copyright information provided by Debian maintainers, the research also relies on automated tools (specifically FOSSology) to retrieve license information. Sullivan also warned against "ascribing reason to numbers": people may have different reasons for choosing a particular license. Developers may choose the MIT license because it has fewer words, for compatibility reasons, or simply because "their lawyers told them to". It may not imply an actual deliberate philosophical or ideological choice. Finally, he brought up the theory that the rise of non-copyleft licenses isn't necessarily at the detriment of the GPL. He explained that, even if there is an actual decline, it may not be much of a problem if there is an overall growth of free software to the detriment of proprietary software. He reminded the audience that non-copyleft licenses are still free software, according to the FSF and the Debian Free Software Guidelines, so their rise is still a positive outcome. Even if the GPL is a better tool to accomplish the goal of a free-software world, we can all acknowledge that the conversion of proprietary software to more permissive and certainly simpler licenses is definitely heading in the right direction.
[I would like to thank the DebConf organizers for providing meals for me during the conference.] Note: this article first appeared in the Linux Weekly News.

25 February 2017

Stefano Zacchiroli: Software Freedom Conservancy matching

become a Conservancy supporter by February 28th and have your donation matched Non-profits that provide project support have proven themselves to be necessary for the success and advancement of individual projects and Free Software as a whole. The Free Software Foundation (founded in 1985) serves as a home to GNU projects and a canonical list of Free Software licenses. The Open Source Initiative came about in 1998, maintaining the Open Source Definition, based on the Debian Free Software Guidelines, with affiliate members including Debian, Mozilla, and the Wikimedia Foundation. Software in the Public Interest (SPI) was created in the late 90s largely to act as a fiscal sponsor for projects like Debian, enabling it to do things like accept donations and handle other financial transactions. More recently (2006), the Software Freedom Conservancy was formed. Among other activities like serving as a fiscal sponsor, infrastructure provider, and support organization for a number of free software projects including Git, Outreachy, and the Debian Copyright Aggregation Project they protect user freedom via copyleft compliance and GPL enforcement work. Without a willingness to act when licenses are violated, copyleft has no power. Through communication, collaboration, and only as last resort litigation, the Conservancy helps everyone who uses a freedom respecting license. The Conservancy has been aggressively fundraising in order to not just continue its current operations, but expand their work, staff, and efforts. They recently launched a donation matching campaign thanks to the generosity and dedication of an anonymous donor. Everyone who joins the Conservancy as a annual Supporter by February 28th will have their donation matched. A number of us are already supporters, and hope you will join us in supporting the world of an organization that supports us.

12 February 2017

Stefano Zacchiroli: Opening the Software Heritage archive

... one API (and one FOSDEM) at a time [ originally posted on the Software Heritage blog, reposted here with minor adaptations ] Last Saturday at FOSDEM we have opened up the public API of Software Heritage, allowing to programmatically browse its archive. We posted this while I was keynoting with Roberto at FOSDEM 2017, to discuss the role Software Heritage plays in preserving the Free Software commons. To accompany the talk we released our first public API, which allows to navigate the entire content of the Software Heritage archive as a graph of connected development objects (e.g., blobs, directories, commits, releases, etc.). Over the past months we have been busy working on getting source code (with full development history) into the archive, to minimize the risk that important bits of Free/Open Sources Software that are publicly available today disappear forever from the net, due to whatever reason --- crashes, black hat hacking, business decisions, you name it. As a result, our archive is already one of the largest collections of source code in existence, spanning a GitHub mirror, injections of important Free Software collections such as Debian and GNU, and an ongoing import of all Google Code and Gitorious repositories. Up to now, however, the archive was deposit-only. There was no way for the public to access its content. While there is a lot of value in archival per se, our mission is to Collect, Preserve, and Share all the material we collect with everybody. Plus, we totally get that a deposit-only library is much less exciting than a store-and-retrieve one! Last Saturday we took a first important step towards providing full access to the content of our archive: we released version 1 of our public API, which allows to navigate the Software Heritage archive programmatically. You can have a look at the API documentation for full details about how it works. But to briefly recap: conceptually, our archive is a giant Merkle DAG connecting together all development-related objects we encounter while crawling public VCS repositories, source code releases, and GNU/Linux distribution packages. Examples of the objects we store are: file contents, directories, commits, releases; as well as their metadata, such as: log messages, author information, permission bits, etc. The API we have just released allows to pointwise navigate this huge graph. Using the API you can lookup individual objects by their IDs, retrieve their metadata, and jump from one object to another following links --- e.g., from a commit to the corresponding directory or parent commits, from a release to the annotated commit, etc. Additionally, you can retrieve crawling-related information, such as the software origins we track (usually as VCS clone/checkout URLs), and the full list of visits we have done on any known software origin. This allows, for instance, to know when we took snapshots of a Git repository you care about and, for each visit, where each branch of the repo was pointing to at that time. Our resources for offering the API as a public service are still quite limited. This is the reason why you will encounter a couple of limitations. First, no download of the actual content of files we have stored is possible yet --- you can retrieve all content-related metadata (e.g., checksums, detected file types and languages, etc.), but not the actual content as a byte sequence. Second, some pretty severe rate limits apply; API access is entirely anonymous and users are identified by their IP address, each "user" will be able to do a little bit more than 100 requests/hour. This is to keep our infrastructure sane while we grow in capacity and focus our attention to developing other archive features. If you're interested in having rate limits lifted for a specific use case or experiment, please contact us and we will see what we can do to help. If you'd like to contribute to increase our resource pool, have a look at our sponsorship program!

28 November 2016

Stefano Zacchiroli: last week to take part in the Debian Contributors Survey

Debian Contributors Survey 2016 About 3 weeks ago, together with Molly and Mathieu, we launched the first edition of the Debian Contributors Survey. I won't harp on it any further, because you can find all relevant information about it on the Debian blog or as part of the original announcement. But it's worth noting that you've now only one week left to participate if you want to: the deadline for participation is 4 December 2016, at 23:59 UTC. If you're a Debian contributor and would like to participate, just go to the survey participation page and fill in!

16 November 2016

Bits from Debian: Debian Contributors Survey 2016

The Debian Contributor Survey launched last week! In order to better understand and document who contributes to Debian, we (Mathieu ONeil, Molly de Blanc, and Stefano Zacchiroli) have created this survey to capture the current state of participation in the Debian Project through the lense of common demographics. We hope a general survey will become an annual effort, and that each year there will also be a focus on a specific aspect of the project or community. The 2016 edition contains sections concerning work, employment, and labour issues in order to learn about who is getting paid to work on and with Debian, and how those relationships affect contributions. We want to hear from as many Debian contributors as possible whether you've submitted a bug report, attended a DebConf, reviewed translations, maintain packages, participated in Debian teams, or are a Debian Developer. Completing the survey should take 10-30 minutes, depending on your current involvement with the project and employment status. In an effort to reflect our own ideals as well as those of the Debian project, we are using LimeSurvey, an entirely free software survey tool, in an instance of it hosted by the LimeSurvey developers. Survey responses are anonymous, IP and HTTP information are not logged, and all questions are optional. As it is still likely possible to determine who a respondent is based on their answers, results will only be distributed in aggregate form, in a way that does not allow deanonymization. The results of the survey will be analyzed as part of ongoing research work by the organizers. A report discussing the results will be published under a DFSG-free license and distributed to the Debian community as soon as it's ready. The raw, disaggregated answers will not be distributed and will be kept under the responsibility of the organizers. We hope you will fill out the Debian Contributor Survey. The deadline for participation is: 4 December 2016, at 23:59 UTC. If you have any questions, don't hesitate to contact us via email at:

18 August 2016

Zlatan Todori : DebConf16 - new age in Debian community gathering

DebConf16 Finally got some time to write this blog post. DebConf for me is always something special, a family gathering of weird combination of geeks (or is weird a default geek state?). To be honest, I finally can compare Debian as hacker conference to other so-called hacker conferences. With that hat on, I can say that Debian is by far the most organized and highest quality conference. Maybe I am biased, but I don't care too much about that. I simply love Debian and that is no secret. So lets dive into my view on DebConf16 which was held in Cape Town, South Africa. Cape Town This was the first time we had conference on African continent (and I now see for the first time DebConf bid for Asia, which leaves only Australia and beautiful Pacific islands to start a bid). Cape Town by itself, is pretty much Europe-like city. That was kinda a bum for me on first day, especially as we were hosted at University of Cape Town (which is quite beautiful uni) and the surrounding neighborhood was very European. Almost right after the first day I was fine because I started exploring the huge city. Cape Town is really huge, it has by stats ~4mil people, and unofficially it has ~6mil. Certainly a lot to explore and I hope one day to be back there (I actually hope as soon as possible). The good, bad and ugly I will start with bad and ugly as I want to finish with good notes. Racism down there is still HUGE. You don't have signs on the road saying that, but there is clearly separation between white and black people. The houses near uni all had fences on walls (most of them even electrical ones with sharp blades on it) with bars on windows. That just bring tensions and certainly doesn't improve anything. To be honest, if someone wants to break in they still can do easily so the fences maybe need to bring intimidation but they actually only bring tension (my personal view). Also many houses have sign of Armed Force Response (something in those lines) where in case someone would start breaking in, armed forces would come to protect the home. Also compared to workforce, white appear to hold most of profit/big business positions and fields, while black are street workers, bar workers etc etc. On the street you can feel from time to time the tension between people. Going out to bars also showed the separation - they were either almost exclusively white or exclusively black. Very sad state to see. Sharing love and mixing is something that pushes us forward and here I saw clear blockades for such things. The bad part of Cape Town is, and this is not only special to Cape Town but to almost all major cities, is that small crime is on wide scale. Pickpocketing here is something you must pay attention to it. To me, personally, nothing happened but I heard a lot of stories from my friends on whom were such activities attempted (although I am not sure did the criminals succeed). Enough of bad as my blog post will not change this and it is a topic for debate and active involvement which I can't unfortunately do at this moment. THE GOOD! There are so many great local people I met! As I mentioned, I want to visit that city again and again and again. If you don't fear of those bad things, this city has great local cuisine, a lot of great people, awesome art soul and they dance with heart (I guess when you live in rough times, you try to use free time at your best). There were difference between white and black bars/clubs - white were almost like standard European, a lot of drinking and not much dancing, and black were a lot of dancing and not much drinking (maybe the economical power has something to do with it but I certainly felt more love in black bars). Cape Town has awesome mountain, the Table Mountain. I went on hiking with my friends, and I must say (again to myself) - do the damn hiking as much as possible. After every hike I feel so inspired, that I will start thinking that I hate myself for not doing it more often! The view from Table mountain is just majestic (you can even see the Cape of Good Hope). The WOW moments are just firing up in you. Now lets transfer to DebConf itself. As always, organization was on quite high level. I loved the badge design, it had a map and nice amount of information on it. The place we stayed was kinda not that good but if you take it into account that those a old student dorms (in we all were in female student dorm :D ) it is pretty fancy by its own account. Talks were near which is always good. The general layout of talks and front desk position was perfect in my opinion. All in one place basically. Wine and Cheese this year was kinda funny story because of the cheese restrictions but Cheese cabal managed to pull out things. It was actually very well organized. Met some new people during the party/ceremony which always makes me grow as a person. Cultural mix on DebConf is just fantastic. Not only you learn a lot about Debian, hacking on it, but sheer cultural diversity makes this small con such a vibrant place and home to a lot. Debian Dinner happened in Aquarium were I had nice dinner and chat with my old friends. Aquarium by itself is a thing where you can visit and see a lot of strange creatures that live on this third rock from Sun. Speaking of old friends - I love that I Apollo again rejoined us (by missing the DebConf15), seeing Joel again (and he finally visited Banja Luka as aftermath!), mbiebl, ah, moray, Milan, santiago and tons of others. Of course we always miss a few such as zack and vorlon this year (but they had pretty okay-ish reasons I would say). Speaking of new friends, I made few local friends which makes me happy and at least one Indian/Hindu friend. Why did I mention this separately - well we had an accident during Group Photo (btw, where is our Lithuanian, German based nowdays, photographer?!) where 3 laptops of our GSoC students were stolen :( . I was luckily enough to, on behalf of Purism, donate Librem11 prototype to one of them, which ended up being the Indian friend. She is working on real time communications which is of interest also to Purism for our future projects. Regarding Debian Day Trip, Joel and me opted out and we went on our own adventure through Cape Town in pursue of meeting and talking to local people, finding out interesting things which proved to be a great decision. We found about their first Thursday of month festival and we found about Mama Africa restaurant. That restaurant is going into special memories (me playing drums with local band must always be a special memory, right?!). Huh, to be honest writing about DebConf would probably need a book by itself and I always try to keep my posts as short as possible so I will try to stop here (maybe I write few bits in future more about it but hardly). Now the notes. Although I saw the racial segregation, I also saw the hope. These things need time. I come from country that is torn apart in nationalism and religious hate so I understand this issues is hard and deep on so many levels. While the tensions are high, I see people try to talk about it, try to find solution and I feel it is slowly transforming into open society, where we will realize that there is only one race on this planet and it is called - HUMAN RACE. We are all earthlings, and as sooner we realize that, sooner we will be on path to really build society up and not fake things that actually are enslaving our minds. I just want in the end to say thank you DebConf, thank you Debian and everyone could learn from this community as a model (which can be improved!) for future societies.

16 August 2016

Lars Wirzenius: 20 years ago I became a Debian developer

Today it is 23 years ago since Ian Murdock published his intention to develop a new Linux distribution, Debian. It also about 20 years since I became a Debian developer and made my first package upload. In the time since: It's been a good twenty years. And the fun ain't over yet.

8 February 2016

Orestis Ioannou: Debian - your patches and machine readable copyright files are available on Debsources

TL;DR All Debian license and patches are belong to us. Discover them here and here. In case you hadn't already stumbled upon sources.debian.net in the past, Debsources is a simple web application that allows to publish an unpacked Debian source mirror on the Web. On the live instance you can browse the contents of Debian source packages with syntax highlighting, search files matching a SHA-256 hash or a ctag, query its API, highlight lines, view accurate statistics and graphs. It was initially developed at IRILL by Stefano Zacchiroli and Matthieu Caneill. During GSOC 2015 I helped introduce two new features. License Tracker Since Debsources has all the debian/copyright files and that many of them adopted the DEP-5 suggestion (machine readable copyright files) it was interesting to exploit them for end users. You may find interesting the following features: Have a look at the documentation to discover more! Patch tracker The old patch tracker unfortunately died a while ago. Since Debsources stores all the patches it was, probably, natural for it to be able to exploit them and present them over the web. You can navigate through packages by prefix or by searching them here. Among the use cases: Read more about the API! Coming ... I hope you find these new features useful. Don't hesitate to report any bugs or suggestions you come accross.

1 February 2016

Stefano Zacchiroli: guest lecture Overthrowing the Tyranny of Software by John Sullivan

As part of my master class on Free and Open Source (FOSS) Software at University Paris Diderot, I invite guest lecturers to present to my students the point of views of various actors of the FOSS ecosystem --- companies, non-profits, activists, lawyers, etc. Tomorrow, Tuesday 2 February 2016, the students will have the pleasure to have as guest lecturer John Sullivan, Executive Director of the Free Software Foundation, talking about Overthrowing the tyranny of software: Why (and how) free societies respect computer user freedom. The lecture is open to everyone interested, but registration is recommended. Logistic and registration information, as well as the lecture abstract in both English and French is reported below.
John Sullivan's Lecture at University Paris Diderot - Overthrowing the tyranny of software: Why (and how) free societies respect computer user freedom John Sullivan, Executive Director of the Free Software Foundation will give a lecture titled "Overthrowing the tyranny of software: Why (and how) free societies respect computer user freedom" at University Paris Diderot next Tuesday, 2 February 2016, at 12:30 in the Amphi 3B, Halle aux Farines building, Paris 75013. Map at: http://www.openstreetmap.org/way/62378611#map=19/48.82928/2.38183 The lecture will be in English and open to everyone, but registration is recommended at https://framadate.org/iPqfjNTz2535F8u4 or via email writing to zack@pps.univ-paris-diderot.fr. Abstract: Anyone who has used a computer for long has at least sometimes felt like a helpless subject under the tyrant of software, screaming (uselessly) in frustration at the screen to try and get the desired results. But with driverless cars, appliances which eavesdrop on conversations in our homes, mobile devices that transmit our location when we are out and about, and computers with unexpected hidden "features", our inability to control the software supposedly in our possession has become a much more serious problem than the superficial blue-screen-of-death irritations of the past. Software which is free "as in freedom" allows anyone who has it to inspect the code and even modify it -- or ask someone trained in the dark arts of computer programming to do it for them -- so that undesirable behaviors can be removed or defused. This characteristic, applied to all software, should be a major part of foundation of free societies moving forward. To get there, we'll need individual developers, nonprofit organizations, governments, and companies all working together -- with the first two groups leading the way.
Cours Magistral de John Sullivan l'Universit Paris Diderot - Surmonter la tyrannie du logiciel: pourquoi (et comment) les soci t s libres respectent les libert s des utilisateurs John Sullivan, Directeur Ex cutif de la Free Software Foundation donnera un cours magistral ayant pour titre "Surmonter la tyrannie du logiciel: pourquoi (et comment) les soci t s libres respectent les libert s des utilisateurs" l'Universit Paris Diderot Mardi prochain, 2 f vrier 2016, 12h30 dans l'Amphi 3B de la Halle aux Farines, Paris 75013. Plan: http://www.openstreetmap.org/way/62378611#map=19/48.82928/2.38183 Le cours (en langue Anglaise) sera ouvert toutes et tous, mais l'inscription est recommand via le formulaire https://framadate.org/iPqfjNTz2535F8u4 ou par mail l'adresse zack@pps.univ-paris-diderot.fr. R sum : Chacun de nous, au moins une fois dans sa vie, a pest contre son ordinateur dans l'espoir (vain) d'obtenir un r sultat attendu, se sentant d poss de par un tyran logiciel. Mais au jour d'aujourd'hui - avec des voitures autonomes, des dispositifs "intelligents" que nous coutent chez nous, des portables qui transmettent notre position quand nous nous baladons, et des ordinateurs pleins des fonctionnalit s cach es - notre incapacit de contr ler nos biens devient une question beaucoup plus s rieuse par rapport a l'irritation qu'auparavant nous causait l' cran bleu de la mort. Le logiciel libre permet chaque utilisateur d' tudier son fonctionnement et de le modifier --- ou de demander des experts dans la magie noire de la programmation de le faire a sa place --- supprimant, ou du moins r duisant, les comportements ind sir s du logiciel. Cette caract ristique du logiciel libre devrait tre appliqu e chaque type de logiciel et devrait constituer un pilier des soci t s se pr tendant libres. Pour achever cet id al, d veloppeurs, organisations but non lucratif, gouvernements et entreprises doivent travailler ensemble. Et les d veloppeurs et les ONG doivent se positionner au premier rang dans ce combat.

29 December 2015

Stefano Zacchiroli: Shuttleworth Foundation Flash Grant - 2015 report

1 year of Shuttleworth Foundation Flash Grant As announced last year, starting January 2015 I've benefited from a "Flash Grant" kindly awarded to me by the Shuttleworth Foundation. This post reports publicly about how I've used the money to promote Free Software via my own activism, over the period January-December 2015. I'm lucky to have a full-time academic job that provides me with a salary and basic computer hardware. But Free Software not being the only focus of my job, it gets difficult at times to get travel funding to specific Free Software events. So that is what I've mostly used the grant money for: attend Free Software events that I wouldn't have been able to attend otherwise. On grant money I've attended LibrePlanet 2015 (2015-03-19-boston-libreplanet label in the financial reports below), where I've given the talk Distributions and the Free "Cloud", and FSFE's LLW 2015 (2015-04-15-barcelona-fsfe-legal) workshop. Furthermore I've used the grant to reimburse otherwise not reimbursed out of pocket expenses in a trip to San Francisco (2015-11-06-san-francisco-gsoc+osi) that have been otherwise sponsored by Google (to attend the Summer of Code Mentor Summit) and OSI (to attend a F2F meeting of the Board of Directors). Finally, I've used grant money to offer lunch to invited lecturers in my master-level Free Software class at the university (label 2015-foss-class). Actual financial reports are reported below, in ledger format. It should be noted that, contrary to the usual expected 6-month duration of flash grants, I've used only about half the grant amount over a 12-month period; I do not plan to pocket what remains, but rather keep on using it over the next year, reporting again publicly at the end of the period. Also, I did not breakdown further out of pocket expenses, but they invariably stand for public transport tickets and meals. Balance sheet Overall:
         1966,11 EUR  Assets:Funds
        -4052,52 EUR  Equity:Opening balances
         2086,41 EUR  Expenses
           15,90 EUR    Bank:Commissions
          424,00 EUR    Conference:Registration
           56,50 EUR    Teaching:Speaker-invitation
         1590,01 EUR    Travel
          249,02 EUR      Lodgement
          562,51 EUR      Out-of-pocket
          778,48 EUR      Plane
--------------------
                   0

Breakdown by purpose: Journal
2014-12-03 Shuttleworth Foundation flash grant                                    Equity:Opening balances                     -4052,52 EUR    -4052,52 EUR
                                                                                  Assets:Funds                                 4052,52 EUR               0
2014-12-04 bank commissions on incoming transfer                                  Expenses:Bank:Commissions                      15,90 EUR       15,90 EUR
                                                                                  Assets:Funds                                  -15,90 EUR               0
2014-12-24 plane tickets Paris-Boston round trip to attend LibrePlanet 2015       Expenses:Travel:Plane                         627,84 EUR      627,84 EUR
                                                                                  Assets:Funds                                 -627,84 EUR               0
2015-01-02 LibrePlanet 2015 registration + travel fund contribution               Expenses:Conference:Registration              424,00 EUR      424,00 EUR
                                                                                  Assets:Funds                                 -424,00 EUR               0
2015-03-02 plane tickets Paris-Barcelona round trip to attend LLW 2015            Expenses:Travel:Plane                         150,64 EUR      150,64 EUR
                                                                                  Assets:Funds                                 -150,64 EUR               0
2015-03-19 lunch with invited speaker for lecture about FOSS release management   Expenses:Teaching:Speaker-invitation           28,00 EUR       28,00 EUR
                                                                                  Assets:Funds                                  -28,00 EUR               0
2015-03-25 lunch with invited speaker for lecture about FOSS business models      Expenses:Teaching:Speaker-invitation           28,50 EUR       28,50 EUR
                                                                                  Assets:Funds                                  -28,50 EUR               0
2015-04-03 LibrePlanet 2015 out of pocket expenses                                Expenses:Travel:Out-of-pocket                 213,38 EUR      213,38 EUR
                                                                                  Assets:Funds                                 -213,38 EUR               0
2015-04-15 LLW 2015 out of pocket expenses                                        Expenses:Travel:Out-of-pocket                  80,00 EUR       80,00 EUR
                                                                                  Assets:Funds                                  -80,00 EUR               0
2015-05-06 hotel in Barcelona for LLW 2015 (3 nights)                             Expenses:Travel:Lodgement                     249,02 EUR      249,02 EUR
                                                                                  Assets:Funds                                 -249,02 EUR               0
2015-11-29 OSI F2F Fall 2015 out of pocket expenses                               Expenses:Travel:Out-of-pocket                 269,13 EUR      269,13 EUR
                                                                                  Assets:Funds                                 -269,13 EUR               0

24 July 2015

Martin Michlmayr: Congratulations to Stefano Zacchiroli

Stefano Zacchiroli receiving the O'Reilly Open Source Award I attended OSCON's closing sessions today and was delighted to see my friend Stefano Zacchiroli (Zack) receive an O'Reilly Open Source Award. Zack acted as Debian Project Leader for three years, is working on important activities at the Open Source Initiative and the Free Software Foundation, and is generally an amazing advocate for free software. Thanks for all your contributions, Zack, and congratulations!

Next.