Search Results: "Andreas Beckmann"

10 June 2023

Freexian Collaborators: Debian Contributions: /usr-merge updates, tox 4 transition, and more! (by Utkarsh Gupta, Stefano Rivera)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

/usr-merge, by Helmut Grohne, et al Towards the end of April, the discussion on DEP 17 on debian-devel@l.d.o initiated by Helmut Grohne took off, trying to deal with the fact that while Debian bookworm has a merged /usr, files are still being distributed to / and /usr in Debian binary packages, and moving them currently has some risk of breakage. Most participants of the discussion agreed that files should be moved, and there are several competing design ideas for doing it safely. Most of the time was spent understanding the practical implications of lifting the moratorium and moving all the files from / to /usr in a coordinated effort. With help from Emilio Pozuelo Monfort, Enrico Zini, and Raphael Hertzog, Helmut Grohne performed extensive analysis of the various aspects, including quantitative analysis of the original file move problem, analysis of effects on dpkg-divert, dpkg-statoverride, and update-alternatives, analysis of effects on filesystem bootstrapping tools. Most of the problematic cases spawned plausible workarounds, such as turning Breaks into Conflicts in selected cases or adding protective diversions for the symbolic links that enable aliasing. Towards the end of May, Andreas Beckmann reported a new failure scenario which may cause shared resources to inadvertently disappear, such as directories and even regular files in case of Multi-Arch packages, and our work on analyzing these problems and proposing mitigations is on-going. While the quantitative analysis is funded by Freexian, we wouldn t be here without the extensive feedback and ideas of many voluntary contributors from multiple areas of Debian, which are too many to name here. Thank you.

Preparing for the tox 4 transition, by Stefano Rivera While Debian was in freeze for the bookworm release, tox 4 has landed in Debian experimental, and some packages are starting to require it, upstream. It has some backwards-incompatible behavior that breaks many packages using tox through pybuild. So Stefano had to make some changes to pybuild and to many packages that run build-time tests with tox. The easy bits of this transition are now completed in git / experimental, but a few packages that integrate deeply into tox need upstream work.

Debian Printing, by Thorsten Alteholz Just before the release of Bookworm, lots of QA tools were used to inspect packages. One of these tools found a systemd service file in a wrong directory. So, Thorsten did another upload of package lprint to correct this. Thanks a lot to all the hardworking people who run such tools and file bugs. Thorsten also participated in discussions about the new Common Printing Dialog Backends (CPDB) that will be introduced in Trixie and hopefully can replace the current printing architecture in Forky.

Miscellaneous contributions
  • DebConf 23 preparations by Stefano Rivera. Some work on the website, video team planning, accounting, and team documentation.
  • Utkarsh Gupta started to prep the work on the bursary team s side for DC23.
  • Stefano spun up a website for the Hamburg mini-DebConf so that the video team could have a machine-readable schedule and a place to stream video from the event.
  • Santiago Ruano Rinc n reviewed and sponsored four python packages of a prospective Debian member.
  • Helmut Grohne supported Timo Roehling and Jochen Sprickerhof to improve cross building in 15 ROS packages.
  • Helmut Grohne supported Jochen Sprickerhof with diagnosing an e2fsprogs RC bug.
  • Helmut Grohne continued to maintain rebootstrap and located an issue with lto in gcc-13.
  • Anton Gladky fixed some RC-Bugs and uploaded a new stravalib python library.

20 November 2017

Reproducible builds folks: Reproducible Builds: Weekly report #133

Here's what happened in the Reproducible Builds effort between Sunday November 5 and Saturday November 11 2017: Upcoming events On November 17th Chris Lamb will present at Open Compliance Summit, Yokohama, Japan on how reproducible builds ensures the long-term sustainability of technology infrastructure. We plan to hold an assembly at 34C3 - hope to see you there! LEDE CI tests Thanks to the work of lynxis, Mattia and h01ger, we're now testing all LEDE packages in our setup. This is our first result for the ar71xx target: "502 (100.0%) out of 502 built images and 4932 (94.8%) out of 5200 built packages were reproducible in our test setup." - see below for details how this was achieved. Bootstrapping and Diverse Double Compilation As a follow-up of a discussion on bootstrapping compilers we had on the Berlin summit, Bernhard and Ximin worked on a Proof of Concept for Diverse Double Compilation of tinycc (aka tcc). Ximin Luo did a successful diverse-double compilation of tinycc git HEAD using gcc-7.2.0, clang-4.0.1, icc-18.0.0 and pgcc-17.10-0 (pgcc needs to triple-compile it). More variations are planned for the future, with the eventual aim to reproduce the same binaries cross-distro, and extend it to test GCC itself. Packages reviewed and fixed, and bugs filed Patches filed upstream: Patches filed in Debian: Patches filed in OpenSUSE: Reviews of unreproducible packages 73 package reviews have been added, 88 have been updated and 40 have been removed in this week, adding to our knowledge about identified issues. 4 issue types have been updated: Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development Mattia Rizzolo uploaded version 88~bpo9+1 to stretch-backports. reprotest development reproducible-website development theunreproduciblepackage development tests.reproducible-builds.org in detail Misc. This week's edition was written by Ximin Luo, Bernhard M. Wiedemann, Chris Lamb and Holger Levsen & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

3 October 2017

Reproducible builds folks: Reproducible Builds: Weekly report #127

Here's what happened in the Reproducible Builds effort between Sunday September 24 and Saturday September 30 2017: Development and fixes in key packages Kai Harries did an initial packaging of the Nix package manager for Debian. You can track his progress in #877019. Uploads in Debian: Packages reviewed and fixed, and bugs filed Patches sent upstream: Reproducible bugs (with patches) filed in Debian: QA bugs filed in Debian: Reviews of unreproducible packages 103 package reviews have been added, 153 have been updated and 78 have been removed in this week, adding to our knowledge about identified issues. Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development Mattia Rizzolo uploaded version 87 to stretch-backports. strip-nondeterminism development reprotest development tests.reproducible-builds.org reproducible-website development Misc. This week's edition was written by Ximin Luo, Bernhard M. Wiedemann, Holger Levsen and Chris Lamb & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

14 August 2017

Reproducible builds folks: Reproducible Builds: Weekly report #119

Here's what happened in the Reproducible Builds effort between Sunday July 30 and Saturday August 5 2017: Media coverage We were mentioned on Late Night Linux Episode 17, around 29:30. Packages reviewed and fixed, and bugs filed Upstream packages: Debian packages: Reviews of unreproducible packages 29 package reviews have been added, 72 have been updated and 151 have been removed in this week, adding to our knowledge about identified issues. 4 issue types have been updated: Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development Version 85 was uploaded to unstable by Mattia Rizzolo. It included contributions from: as well as previous weeks' contributions, summarised in the changelog. There were also further commits in git, which will be released in a later version: Misc. This week's edition was written by Ximin Luo, Bernhard M. Wiedemann and Chris Lamb & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

12 July 2017

Reproducible builds folks: Reproducible Builds: week 115 in Stretch cycle

Here's what happened in the Reproducible Builds effort between Sunday July 2 and Saturday July 8 2017: Reproducible work in other projects Ed Maste pointed to a thread on the LLVM developer mailing list about container iteration being the main source of non-determinism in LLVM, together with discussion on how to solve this. Ignoring build path issues, container iteration order was also the main issue with rustc, which was fixed by using a fixed-order hash map for certain compiler structures. (It was unclear from the thread whether LLVM's builds are truly path-independent or rather that they haven't done comparisons between builds run under different paths.) Bugs filed Patches submitted upstream: Reviews of unreproducible packages 52 package reviews have been added, 62 have been updated and 20 have been removed in this week, adding to our knowledge about identified issues. No issue types were updated or added this week. Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported by: diffoscope development Development continued in git with contributions from: With these changes, we are able to generate a dynamically loaded HTML diff for GCC-6 that can be displayed in a normal web browser. For more details see this mailing list post. Misc. This week's edition was written by Ximin Luo, Bernhard M. Wiedemann and Chris Lamb & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

17 January 2016

Lunar: Reproducible builds: week 38 in Stretch cycle

What happened in the reproducible builds effort between January 10th and January 16th:

Toolchain fixes Benjamin Drung uploaded mozilla-devscripts/0.43 which sorts the file list in preferences files. Original patch by Reiner Herrmann. Lunar submitted an updated patch series to make timestamps in packages created by dpkg deterministic. To ensure that the mtimes in data.tar are reproducible, with the patches, dpkg-deb uses the --clamp-mtime option added in tar/1.28-1 when available. An updated package has been uploaded to the experimental repository. This removed the need for a modified debhelper as all required changes for reproducibility have been merged or are now covered by dpkg.

Packages fixed The following packages have become reproducible due to changes in their build dependencies: angband-doc, bible-kjv, cgoban, gnugo, pachi, wmpuzzle, wmweather, wmwork, xfaces, xnecview, xscavenger, xtrlock, virt-top. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues, but not all of them: Untested changes:

reproducible.debian.net Once again, Vagrant Cascadian is providing another armhf build system, allowing to run 6 more armhf builder jobs, right there. (h01ger) Stop requiring a modified debhelper and adapt to the latest dpkg experimental version by providing a predetermined identifier for the .buildinfo filename. (Mattia Rizzolo, h01ger) New X.509 certificates were set up for jenkins.debian.net and reproducible.debian.net using Let's Encrypt!. Thanks to GlobalSign for providing certificates for the last year free of charge. (h01ger)

Package reviews 131 reviews have been removed, 85 added and 32 updated in the previous week. FTBFS issues filled: 29. Thanks to Chris Lamb, Mattia Rizzolo, and Niko Tyni. New issue identified: timestamps_in_manpages_added_by_golang_cobra.

Misc. Most of the minutes from the meetings held in Athens in December 2015 are now available to the public.

18 October 2015

Lunar: Reproducible builds: week 25 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes Niko Tyni wrote a new patch adding support for SOURCE_DATE_EPOCH in Pod::Man. This would complement or replace the previously implemented POD_MAN_DATE environment variable in a more generic way. Niko Tyni proposed a fix to prevent mtime variation in directories due to debhelper usage of cp --parents -p. Packages fixed The following 119 packages became reproducible due to changes in their build dependencies: aac-tactics, aafigure, apgdiff, bin-prot, boxbackup, calendar, camlmix, cconv, cdist, cl-asdf, cli-common, cluster-glue, cppo, cvs, esdl, ess, faucc, fauhdlc, fbcat, flex-old, freetennis, ftgl, gap, ghc, git-cola, globus-authz-callout-error, globus-authz, globus-callout, globus-common, globus-ftp-client, globus-ftp-control, globus-gass-cache, globus-gass-copy, globus-gass-transfer, globus-gram-client, globus-gram-job-manager-callout-error, globus-gram-protocol, globus-gridmap-callout-error, globus-gsi-callback, globus-gsi-cert-utils, globus-gsi-credential, globus-gsi-openssl-error, globus-gsi-proxy-core, globus-gsi-proxy-ssl, globus-gsi-sysconfig, globus-gss-assist, globus-gssapi-error, globus-gssapi-gsi, globus-net-manager, globus-openssl-module, globus-rsl, globus-scheduler-event-generator, globus-xio-gridftp-driver, globus-xio-gsi-driver, globus-xio, gnome-control-center, grml2usb, grub, guilt, hgview, htmlcxx, hwloc, imms, kde-l10n, keystone, kimwitu++, kimwitu-doc, kmod, krb5, laby, ledger, libcrypto++, libopendbx, libsyncml, libwps, lprng-doc, madwimax, maria, mediawiki-math, menhir, misery, monotone-viz, morse, mpfr4, obus, ocaml-csv, ocaml-reins, ocamldsort, ocp-indent, openscenegraph, opensp, optcomp, opus, otags, pa-bench, pa-ounit, pa-test, parmap, pcaputils, perl-cross-debian, prooftree, pyfits, pywavelets, pywbem, rpy, signify, siscone, swtchart, tipa, typerep, tyxml, unison2.32.52, unison2.40.102, unison, uuidm, variantslib, zipios++, zlibc, zope-maildrophost. The following packages became reproducible after getting fixed: Packages which could not be tested: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which have not made their way to the archive yet: Lunar reported that test strings depend on default character encoding of the build system in ongl. reproducible.debian.net The 189 packages composing the Arch Linux core repository are now being tested. No packages are currently reproducible, but most of the time the difference is limited to metadata. This has already gained some interest in the Arch Linux community. An explicit log message is now visible when a build has been killed due to the 12 hours timeout. (h01ger) Remote build setup has been made more robust and self maintenance has been further improved. (h01ger) The minimum age for rescheduling of already tested amd64 packages has been lowered from 14 to 7 days, thanks to the increase of hardware resources sponsored by ProfitBricks last week. (h01ger) diffoscope development diffoscope version 37 has been released on October 15th. It adds support for two new file formats (CBFS images and Debian .dsc files). After proposing the required changes to TLSH, fuzzy hashes are now computed incrementally. This will avoid reading entire files in memory which caused problems for large packages. New tests have been added for the command-line interface. More character encoding issues have been fixed. Malformed md5sums will now be compared as binary files instead of making diffoscope crash amongst several other minor fixes. Version 38 was released two days later to fix the versioned dependency on python3-tlsh. strip-nondeterminism development strip-nondeterminism version 0.013-1 has been uploaded to the archive. It fixes an issue with nonconformant PNG files with trailing garbage reported by Roland Rosenfeld. disorderfs development disorderfs version 0.4.1-1 is a stop-gap release that will disable lock propagation, unless --share-locks=yes is specified, as it still is affected by unidentified issues. Documentation update Lunar has been busy creating a proper website for reproducible-builds.org that would be a common location for news, documentation, and tools for all free software projects working on reproducible builds. It's not yet ready to be published, but it's surely getting there. Homepage of the future reproducible-builds.org website  Who's involved?  page of the future reproducible-builds.org website Package reviews 103 reviews have been removed, 394 added and 29 updated this week. 72 FTBFS issues were reported by Chris West and Niko Tyni. New issues: random_order_in_static_libraries, random_order_in_md5sums.

14 October 2015

Lunar: Reproducible builds: week 24 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes Scott Kitterman fixed an issue with non-deterministic Depends generated by dh-python identified by Santiago Vila and Chris Lamb. Lunar updated the patch against dpkg which makes the order of files in control.tar.gz deterministic using the new --sort=name option available in GNU Tar 1.28. josch released sbuild version 0.66.0-1 with several fixes and improvements. The most notable one for reproducible builds is the new --build-path option and $build_path configuration variable added by akira which allows to explicitly chose a given build path. Reiner Herrmann wrote a new patch for dh-systemd to sort the list of unit files in the generated maintainer scripts. Packages fixed The following packages became reproducible due to changes in their build dependencies: aoeui, apron, camlmix, cudf, findlib, glpk-java, hawtjni, haxe, java-atk-wrapper, llvm-py, misery, mtasc, ocamldsort, optcomp, spamoracle. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Untested Patches submitted which have not made their way to the archive yet: reproducible.debian.net ProfitBricks once again increased their support for reproducible builds in Debian and in other free software projects by adding 58 new cores and 138 GiB of RAM to the already existing setup. Two new amd64 build nodes and 16 new amd64 build jobs have been added which doubles the build capacity per day and allows us to spot many kind of problems earlier. The size of the tmpfs where builds are performed has also been increased from 70 to 200 GiB on all amd64 build nodes. Huge thanks! When examining a package, a link now points to a table listing all previous recorded tests for the same package. (Mattia) The menu on the package pages has also been improved. (h01ger) Packages in the depwait state are now rescheduled automatically after five days. (h01ger) Links to documentation and other projects being tested have been made more visible on the landing page. (h01ger) To reduce noise on the team IRC channel five different types of notifications have been turned into mail notifications. The remaining ones have been shortened and the status changes have been limited to unstable and experimental. (h01ger) Maintainer notifications about status changes in a package will only be sent out once per day, and not on each status change. (h01ger) diffoscope development Some more experiments of concurrent processing have been made. None were good and reliable enough to be shared, though. Package reviews 48 reviews have been removed, 189 added and 23 updated this week. 9 FTBFS bugs were reported by Chris Lamb. Misc. h01ger met with Levente Polyak to discuss testing Arch Linux on Debian continuous test system with an easily extensible framework. The idea is to also allow testing of other distributions, and provide a nice package based view like the one for Debian.

27 September 2015

Lunar: Reproducible builds: week 22 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes Packages fixed The following 22 packages became reproducible due to changes in their build dependencies: breathe, cdi-api, geronimo-jpa-2.0-spec, geronimo-validation-1.0-spec, gradle-propdeps-plugin, jansi, javaparser, libjsr311-api-java, mac-widgets, mockito, mojarra, pastescript, plexus-utils2, powerline, python-psutil, python-sfml, python-tldap, pythondialog, tox, trident, truffle, zookeeper. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which have not made their way to the archive yet: diffoscope development The changes to make diffoscope run under Python 3, along with many small fixes, entered the archive with version 35 on September 21th. Another release was made the very next day fixed two encoding-related issues discovered when running diffoscope on more Debian packages. strip-nondeterminism development Version 0.12.0 now preserves file permissions on modified zip files and dh_strip_nondeterminism has been made compatible with older debhelper. disorderfs development Version 0.3.0 implemented a multi-user mode that was required to build Debian packages using disorderfs. It also added command line options to control the ordering of files in directory (either shuffled or reversed) and another to do arbitrary changes to the reported space used by files on disk. A couple days later, version 0.4.0 was released to support locks, flush, fsync, fsyncdir, read_buf, and write_buf. Almost all known issues have now been fixed. reproducible.debian.net disorderfs is now used during the second build. This makes file ordering issue very easy to identify as such. (h01ger) Work has been done on making the distributed build setup more reliable. (h01ger) Documentation update Matt Kraii fixed the example on how to fix issues related to dates in Sphinx. Recent Sphinx versions should also be compatible with SOURCE_DATE_EPOCH. Package reviews 53 reviews have been removed, 85 added and 13 updated this week. 46 packages failing to build from source has been identified by Chris Lamb, Chris West, and Niko Tyni. Chris Lamb was the lucky reporter of bug #800000 on vdr-plugin-prefermenu. Issues related to disorderfs are being tracked with a new issue.

26 July 2015

Lunar: Reproducible builds: week 12 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes Eric Dorlan uploaded automake-1.15/1:1.15-2 which makes the output of mdate-sh deterministic. Original patch by Reiner Herrmann. Kenneth J. Pronovici uploaded epydoc/3.0.1+dfsg-8 which now honors SOURCE_DATE_EPOCH. Original patch by Reiner Herrmann. Chris Lamb submitted a patch to dh-python to make the order of the generated maintainer scripts deterministic. Chris also offered a fix for a source of non-determinism in dpkg-shlibdeps when packages have alternative dependencies. Dhole provided a patch to add support for SOURCE_DATE_EPOCH to gettext. Packages fixed The following 78 packages became reproducible in our setup due to changes in their build dependencies: chemical-mime-data, clojure-contrib, cobertura-maven-plugin, cpm, davical, debian-security-support, dfc, diction, dvdwizard, galternatives, gentlyweb-utils, gifticlib, gmtkbabel, gnuplot-mode, gplanarity, gpodder, gtg-trace, gyoto, highlight.js, htp, ibus-table, impressive, jags, jansi-native, jnr-constants, jthread, jwm, khronos-api, latex-coffee-stains, latex-make, latex2rtf, latexdiff, libcrcutil, libdc0, libdc1394-22, libidn2-0, libint, libjava-jdbc-clojure, libkryo-java, libphone-ui-shr, libpicocontainer-java, libraw1394, librostlab-blast, librostlab, libshevek, libstxxl, libtools-logging-clojure, libtools-macro-clojure, litl, londonlaw, ltsp, macsyfinder, mapnik, maven-compiler-plugin, mc, microdc2, miniupnpd, monajat, navit, pdmenu, pirl, plm, scikit-learn, snp-sites, sra-sdk, sunpinyin, tilda, vdr-plugin-dvd, vdr-plugin-epgsearch, vdr-plugin-remote, vdr-plugin-spider, vdr-plugin-streamdev, vdr-plugin-sudoku, vdr-plugin-xineliboutput, veromix, voxbo, xaos, xbae. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which have not made their way to the archive yet: reproducible.debian.net The statistics on the main page of reproducible.debian.net are now updated every five minutes. A random unreviewed package is suggested in the look at a package form on every build. (h01ger) A new package set based new on the Core Internet Infrastructure census has been added. (h01ger) Testing of FreeBSD has started, though no results yet. More details have been posted to the freebsd-hackers mailing list. The build is run on a new virtual machine running FreeBSD 10.1 with 3 cores and 6 GB of RAM, also sponsored by Profitbricks. strip-nondeterminism development Andrew Ayer released version 0.009 of strip-nondeterminism. The new version will strip locales from Javadoc, include the name of files causing errors, and ignore unhandled (but rare) zip64 archives. debbindiff development Lunar continued its major refactoring to enhance code reuse and pave the way to fuzzy-matching and parallel processing. Most file comparators have now been converted to the new class hierarchy. In order to support for archive formats, work has started on packaging Python bindings for libarchive. While getting support for more archive formats with a common interface is very nice, libarchive is a stream oriented library and might have bad performance with how debbindiff currently works. Time will tell if better solutions need to be found. Documentation update Lunar started a Reproducible builds HOWTO intended to explain the different aspects of making software build reproducibly to the different audiences that might have to get involved like software authors, producers of binary packages, and distributors. Package reviews 17 obsolete reviews have been removed, 212 added and 46 updated this week. 15 new bugs for packages failing to build from sources have been reported by Chris West (Faux), and Mattia Rizzolo. Presentations Lunar presented Debian efforts and some recipes on making software build reproducibly at Libre Software Meeting 2015. Slides and a video recording are available. Misc. h01ger, dkg, and Lunar attended a Core Infrastructure Initiative meeting. The progress and tools mode for the Debian efforts were shown. Several discussions also helped getting a better understanding of the needs of other free software projects regarding reproducible builds. The idea of a global append only log, similar to the logs used for Certificate Transparency, came up on multiple occasions. Using such append only logs for keeping records of sources and build results has gotten the name Binary Transparency Logs . They would at least help identifying a compromised software signing key. Whether the benefits in using such logs justify the costs need more research.

12 July 2015

Lunar: Reproducible builds: week 11 in Stretch cycle

Debian is undertaking a huge effort to develop a reproducible builds system. I'd like to thank you for that. This could be Debian's most important project, with how badly computer security has been going.

PerniciousPunk in Reddit's Ask me anything! to Neil McGovern, DPL. What happened in the reproducible builds effort this week: Toolchain fixes More tools are getting patched to use the value of the SOURCE_DATE_EPOCH environment variable as the current time:

In the reproducible experimental toolchain which have been uploaded: Johannes Schauer followed up on making sbuild build path deterministic with several ideas. Packages fixed The following 311 packages became reproducible due to changes in their build dependencies : 4ti2, alot, angband, appstream-glib, argvalidate, armada-backlight, ascii, ask, astroquery, atheist, aubio, autorevision, awesome-extra, bibtool, boot-info-script, bpython, brian, btrfs-tools, bugs-everywhere, capnproto, cbm, ccfits, cddlib, cflow, cfourcc, cgit, chaussette, checkbox-ng, cinnamon-settings-daemon, clfswm, clipper, compton, cppcheck, crmsh, cupt, cutechess, d-itg, dahdi-tools, dapl, darnwdl, dbusada, debian-security-support, debomatic, dime, dipy, dnsruby, doctrine, drmips, dsc-statistics, dune-common, dune-istl, dune-localfunctions, easytag, ent, epr-api, esajpip, eyed3, fastjet, fatresize, fflas-ffpack, flann, flex, flint, fltk1.3, fonts-dustin, fonts-play, fonts-uralic, freecontact, freedoom, gap-guava, gap-scscp, genometools, geogebra, git-reintegrate, git-remote-bzr, git-remote-hg, gitmagic, givaro, gnash, gocr, gorm.app, gprbuild, grapefruit, greed, gtkspellmm, gummiboot, gyp, heat-cfntools, herold, htp, httpfs2, i3status, imagetooth, imapcopy, imaprowl, irker, jansson, jmapviewer, jsdoc-toolkit, jwm, katarakt, khronos-opencl-man, khronos-opengl-man4, lastpass-cli, lava-coordinator, lava-tool, lavapdu, letterize, lhapdf, libam7xxx, libburn, libccrtp, libclaw, libcommoncpp2, libdaemon, libdbusmenu-qt, libdc0, libevhtp, libexosip2, libfreenect, libgwenhywfar, libhmsbeagle, libitpp, libldm, libmodbus, libmtp, libmwaw, libnfo, libpam-abl, libphysfs, libplayer, libqb, libsecret, libserial, libsidplayfp, libtime-y2038-perl, libxr, lift, linbox, linthesia, livestreamer, lizardfs, lmdb, log4c, logbook, lrslib, lvtk, m-tx, mailman-api, matroxset, miniupnpd, mknbi, monkeysign, mpi4py, mpmath, mpqc, mpris-remote, musicbrainzngs, network-manager, nifticlib, obfsproxy, ogre-1.9, opal, openchange, opensc, packaging-tutorial, padevchooser, pajeng, paprefs, pavumeter, pcl, pdmenu, pepper, perroquet, pgrouting, pixz, pngcheck, po4a, powerline, probabel, profitbricks-client, prosody, pstreams, pyacidobasic, pyepr, pymilter, pytest, python-amqp, python-apt, python-carrot, python-django, python-ethtool, python-mock, python-odf, python-pathtools, python-pskc, python-psutil, python-pypump, python-repoze.tm2, python-repoze.what, qdjango, qpid-proton, qsapecng, radare2, reclass, repsnapper, resource-agents, rgain, rttool, ruby-aggregate, ruby-albino, ruby-archive-tar-minitar, ruby-bcat, ruby-blankslate, ruby-coffee-script, ruby-colored, ruby-dbd-mysql, ruby-dbd-odbc, ruby-dbd-pg, ruby-dbd-sqlite3, ruby-dbi, ruby-dirty-memoize, ruby-encryptor, ruby-erubis, ruby-fast-xs, ruby-fusefs, ruby-gd, ruby-git, ruby-globalhotkeys, ruby-god, ruby-hike, ruby-hmac, ruby-integration, ruby-jnunemaker-matchy, ruby-memoize, ruby-merb-core, ruby-merb-haml, ruby-merb-helpers, ruby-metaid, ruby-mina, ruby-net-irc, ruby-net-netrc, ruby-odbc, ruby-ole, ruby-packet, ruby-parseconfig, ruby-platform, ruby-plist, ruby-popen4, ruby-rchardet, ruby-romkan, ruby-ronn, ruby-rubyforge, ruby-rubytorrent, ruby-samuel, ruby-shoulda-matchers, ruby-sourcify, ruby-test-spec, ruby-validatable, ruby-wirble, ruby-xml-simple, ruby-zoom, rumor, rurple-ng, ryu, sam2p, scikit-learn, serd, shellex, shorewall-doc, shunit2, simbody, simplejson, smcroute, soqt, sord, spacezero, spamassassin-heatu, spamprobe, sphinxcontrib-youtube, splitpatch, sratom, stompserver, syncevolution, tgt, ticgit, tinyproxy, tor, tox, transmissionrpc, tweeper, udpcast, units-filter, viennacl, visp, vite, vmfs-tools, waffle, waitress, wavtool-pl, webkit2pdf, wfmath, wit, wreport, x11proto-input, xbae, xdg-utils, xdotool, xsystem35, yapsy, yaz. Please note that some packages in the above list are falsely reproducible. In the experimental toolchain, debhelper exported TZ=UTC and this made packages capturing the current date (without the time) reproducible in the current test environment. The following packages became reproducible after getting fixed: Ben Hutchings upstreamed several patches to fix Linux reproducibility issues which were quickly merged. Some uploads fixed some reproducibility issues but not all of them: Uploads that should fix packages not in main: Patches submitted which have not made their way to the archive yet: reproducible.debian.net A new package set has been added for lua maintainers. (h01ger) tracker.debian.org now only shows reproducibility issues for unstable. Holger and Mattia worked on several bugfixes and enhancements: finished initial test setup for NetBSD, rewriting more shell scripts in Python, saving UDD requests, and more debbindiff development Reiner Herrmann fixed text comparison of files with different encoding. Documentation update Juan Picca added to the commands needed for a local test chroot installation of the locales-all package. Package reviews 286 obsolete reviews have been removed, 278 added and 243 updated this week. 43 new bugs for packages failing to build from sources have been filled by Chris West (Faux), Mattia Rizzolo, and h01ger. The following new issues have been added: timestamps_in_manpages_generated_by_ronn, timestamps_in_documentation_generated_by_org_mode, and timestamps_in_pdf_generated_by_matplotlib. Misc. Reiner Herrmann has submitted patches for OpenWrt. Chris Lamb cleaned up some code and removed cruft in the misc.git repository. Mattia Rizzolo updated the prebuilder script to match what is currently done on reproducible.debian.net.

29 June 2015

Lunar: Reproducible builds: week 9 in Stretch cycle

What happened about the reproducible builds effort this week: Toolchain fixes Norbert Preining uploaded texinfo/6.0.0.dfsg.1-2 which makes texinfo indices reproducible. Original patch by Chris Lamb. Lunar submitted recently rebased patches to make the file order of files inside .deb stable. akira filled #789843 to make tex4ht stop printing timestamps in its HTML output by default. Dhole wrote a patch for xutils-dev to prevent timestamps when creating gzip compresed files. Reiner Herrmann sent a follow-up patch for wheel to use UTC as timezone when outputing timestamps. Mattia Rizzolo started a discussion regarding the failure to build from source of subversion when -Wdate-time is added to CPPFLAGS which happens when asking dpkg-buildflags to use the reproducible profile. SWIG errors out because it doesn't recognize the aforementioned flag. Trying to get the .buildinfo specification to more definitive state, Lunar started a discussion on storing the checksums of the binary package used in dpkg status database. akira discovered while proposing a fix for simgrid that CMake internal command to create tarballs would record a timestamp in the gzip header. A way to prevent it is to use the GZIP environment variable to ask gzip not to store timestamps, but this will soon become unsupported. It's up for discussion if the best place to fix the problem would be to fix it for all CMake users at once. Infrastructure-related work Andreas Henriksson did a delayed NMU upload of pbuilder which adds minimal support for build profiles and includes several fixes from Mattia Rizzolo affecting reproducibility tests. Neils Thykier uploaded lintian which both raises the severity of package-contains-timestamped-gzip and avoids false positives for this tag (thanks to Tomasz Buchert). Petter Reinholdtsen filled #789761 suggesting that how-can-i-help should prompt its users about fixing reproducibility issues. Packages fixed The following packages became reproducible due to changes in their build dependencies: autorun4linuxcd, libwildmagic, lifelines, plexus-i18n, texlive-base, texlive-extra, texlive-lang. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Untested uploaded as they are not in main: Patches submitted which have not made their way to the archive yet: debbindiff development debbindiff/23 includes a few bugfixes by Helmut Grohne that result in a significant speedup (especially on larger files). It used to exhibit the quadratic time string concatenation antipattern. Version 24 was released on June 23rd in a hurry to fix an undefined variable introduced in the previous version. (Reiner Herrmann) debbindiff now has a test suite! It is written using the PyTest framework (thanks Isis Lovecruft for the suggestion). The current focus has been on the comparators, and we are now at 93% of code coverage for these modules. Several problems were identified and fixed in the process: paths appearing in output of javap, readelf, objdump, zipinfo, unsqusahfs; useless MD5 checksum and last modified date in javap output; bad handling of charsets in PO files; the destination path for gzip compressed files not ending in .gz; only metadata of cpio archives were actually compared. stat output was further trimmed to make directory comparison more useful. Having the test suite enabled a refactoring of how comparators were written, switching from a forest of differences to a single tree. This helped removing dust from the oldest parts of the code. Together with some other small changes, version 25 was released on June 27th. A follow up release was made the next day to fix a hole in the test suite and the resulting unidentified leftover from the comparator refactoring. (Lunar) Documentation update Ximin Luo improved code examples for some proposed environment variables for reference timestamps. Dhole added an example on how to fix timestamps C pre-processor macros by adding a way to set the build date externally. akira documented her fix for tex4ht timestamps. Package reviews 94 obsolete reviews have been removed, 330 added and 153 updated this week. Hats off for Chris West (Faux) who investigated many fail to build from source issues and reported the relevant bugs. Slight improvements were made to the scripts for editing the review database, edit-notes and clean-notes. (Mattia Rizzolo) Meetings A meeting was held on June 23rd. Minutes are available. The next meeting will happen on Tuesday 2015-07-07 at 17:00 UTC. Misc. The Linux Foundation announced that it was funding the work of Lunar and h01ger on reproducible builds in Debian and other distributions. This was further relayed in a Bits from Debian blog post.

22 June 2015

Lunar: Reproducible builds: week 8 in Stretch cycle

What happened about the reproducible builds effort this week: Toolchain fixes Andreas Henriksson has improved Johannes Schauer initial patch for pbuilder adding support for build profiles. Packages fixed The following 12 packages became reproducible due to changes in their build dependencies: collabtive, eric, file-rc, form-history-control, freehep-chartableconverter-plugin , jenkins-winstone, junit, librelaxng-datatype-java, libwildmagic, lightbeam, puppet-lint, tabble. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which have not made their way to the archive yet: reproducible.debian.net Bugs with the ftbfs usertag are now visible on the bug graphs. This explain the recent spike. (h01ger) Andreas Beckmann suggested a way to test building packages using the funny paths that one can get when they contain the full Debian package version string. debbindiff development Lunar started an important refactoring introducing abstactions for containers and files in order to make file type identification more flexible, enabling fuzzy matching, and allowing parallel processing. Documentation update Ximin Luo detailed the proposal to standardize environment variables to pass a reference source date to tools that needs one (e.g. documentation generator). Package reviews 41 obsolete reviews have been removed, 168 added and 36 updated this week. Some more issues affecting packages failing to build from source have been identified. Meetings Minutes have been posted for Tuesday June 16th meeting. The next meeting is scheduled Tuesday June 23rd at 17:00 UTC. Presentations Lunar presented the project in French during Pas Sage en Seine in Paris. Video and slides are available.

17 May 2015

Lunar: Reproducible builds: week 3 in Stretch cycle

What happened about the reproducible builds effort for this week: Toolchain fixes Tomasz Buchert submitted a patch to fix the currently overzealous package-contains-timestamped-gzip warning. Daniel Kahn Gillmor identified #588746 as a source of unreproducibility for packages using python-support. Packages fixed The following 57 packages became reproducible due to changes in their build dependencies: antlr-maven-plugin, aspectj-maven-plugin, build-helper-maven-plugin, clirr-maven-plugin, clojure-maven-plugin, cobertura-maven-plugin, coinor-ipopt, disruptor, doxia-maven-plugin, exec-maven-plugin, gcc-arm-none-eabi, greekocr4gamera, haskell-swish, jarjar-maven-plugin, javacc-maven-plugin, jetty8, latexml, libcgi-application-perl, libnet-ssleay-perl, libtest-yaml-valid-perl, libwiki-toolkit-perl, libwww-csrf-perl, mate-menu, maven-antrun-extended-plugin, maven-antrun-plugin, maven-archiver, maven-bundle-plugin, maven-clean-plugin, maven-compiler-plugin, maven-ear-plugin, maven-install-plugin, maven-invoker-plugin, maven-jar-plugin, maven-javadoc-plugin, maven-processor-plugin, maven-project-info-reports-plugin, maven-replacer-plugin, maven-resources-plugin, maven-shade-plugin, maven-site-plugin, maven-source-plugin, maven-stapler-plugin, modello-maven-plugin1.4, modello-maven-plugin, munge-maven-plugin, ocaml-bitstring, ocr4gamera, plexus-maven-plugin, properties-maven-plugin, ruby-magic, ruby-mocha, sisu-maven-plugin, syncache, vdk2, wvstreams, xml-maven-plugin, xmlbeans-maven-plugin. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Ben Hutchings also improved and merged several changes submitted by Lunar to linux. Currently untested because in contrib: reproducible.debian.net
Thanks to the reproducible-build team for running a buildd from hell. gregor herrmann
Mattia Rizzolo modified the script added last week to reschedule a package from Alioth, a reason can now be optionally specified. Holger Levsen splitted the package sets page so each set now has its own page. He also added new sets for Java packages, Haskell packages, Ruby packages, debian-installer packages, Go packages, and OCaml packages. Reiner Herrmann added locales-all to the set of packages installed in the build environment as its needed to properly identify variations due to the current locale. Holger Levsen improved the scheduling so new uploads get tested sooner. He also changed the .json output that is used by tracker.debian.org to lists FTBFS issues again but only for issues unrelated to the toolchain or our test setup. Amongst many other small fixes and additions, the graph colors should now be more friendly to red-colorblind people. The fix for pbuilder given in #677666 by Tim Landscheidt is now used. This fixed several FTBFS for OCaml packages. Work on rebuilding with different CPU has continued, a kvm-on-kvm build host has been set been set up for this purpose. debbindiff development Version 19 of debbindiff included a fix for a regression when handling info files. Version 20 fixes a bug when diffing files with many differences toward a last line with no newlines. It also now uses the proper encoding when writing the text output to a pipe, and detects info files better. Documentation update Thanks to Santiago Vila, the unneeded -depth option used with find when fixing mtimes has been removed from the examples. Package reviews 113 obsolete reviews have been removed this week while 77 has been added.

4 May 2015

Lunar: Reproducible builds: first week in Stretch cycle

Debian Jessie has been released on April 25th, 2015. This has opened the Stretch development cycle. Reactions to the idea of making Debian build reproducibly have been pretty enthusiastic. As the pace is now likely to be even faster, let's see if we can keep everyone up-to-date on the developments. Before the release of Jessie The story goes back a long way but a formal announcement to the project has only been sent in February 2015. Since then, too much work has happened to make a complete report, but to give some highlights: Lunar did a pretty improvised lightning talk during the Mini-DebConf in Lyon. This past week It seems changes were pilling behind the curtains given the amount of activity that happened in just one week. Toolchain fixes We also rebased the experimental version of debhelper twice to merge the latest set of changes. Lunar submitted a patch to add a -creation-date to genisoimage. Reiner Herrmann opened #783938 to request making -notimestamp the default behavior for javadoc. Juan Picca submitted a patch to add a --use-date flag to texi2html. Packages fixed The following packages became reproducible due to changes of their build dependencies: apport, batctl, cil, commons-math3, devscripts, disruptor, ehcache, ftphs, gtk2hs-buildtools, haskell-abstract-deque, haskell-abstract-par, haskell-acid-state, haskell-adjunctions, haskell-aeson, haskell-aeson-pretty, haskell-alut, haskell-ansi-terminal, haskell-async, haskell-attoparsec, haskell-augeas, haskell-auto-update, haskell-binary-conduit, haskell-hscurses, jsch, ledgersmb, libapache2-mod-auth-mellon, libarchive-tar-wrapper-perl, libbusiness-onlinepayment-payflowpro-perl, libcapture-tiny-perl, libchi-perl, libcommons-codec-java, libconfig-model-itself-perl, libconfig-model-tester-perl, libcpan-perl-releases-perl, libcrypt-unixcrypt-perl, libdatetime-timezone-perl, libdbd-firebird-perl, libdbix-class-resultset-recursiveupdate-perl, libdbix-profile-perl, libdevel-cover-perl, libdevel-ptkdb-perl, libfile-tail-perl, libfinance-quote-perl, libformat-human-bytes-perl, libgtk2-perl, libhibernate-validator-java, libimage-exiftool-perl, libjson-perl, liblinux-prctl-perl, liblog-any-perl, libmail-imapclient-perl, libmocked-perl, libmodule-build-xsutil-perl, libmodule-extractuse-perl, libmodule-signature-perl, libmoosex-simpleconfig-perl, libmoox-handlesvia-perl, libnet-frame-layer-ipv6-perl, libnet-openssh-perl, libnumber-format-perl, libobject-id-perl, libpackage-pkg-perl, libpdf-fdf-simple-perl, libpod-webserver-perl, libpoe-component-pubsub-perl, libregexp-grammars-perl, libreply-perl, libscalar-defer-perl, libsereal-encoder-perl, libspreadsheet-read-perl, libspring-java, libsql-abstract-more-perl, libsvn-class-perl, libtemplate-plugin-gravatar-perl, libterm-progressbar-perl, libterm-shellui-perl, libtest-dir-perl, libtest-log4perl-perl, libtext-context-eitherside-perl, libtime-warp-perl, libtree-simple-perl, libwww-shorten-simple-perl, libwx-perl-processstream-perl, libxml-filter-xslt-perl, libxml-writer-string-perl, libyaml-tiny-perl, mupen64plus-core, nmap, openssl, pkg-perl-tools, quodlibet, r-cran-rjags, r-cran-rjson, r-cran-sn, r-cran-statmod, ruby-nokogiri, sezpoz, skksearch, slurm-llnl, stellarium. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which did not make their way to the archive yet: Improvements to reproducible.debian.net Mattia Rizzolo has been working on compressing logs using gzip to save disk space. The web server would uncompress them on-the-fly for clients which does not accept gzip content. Mattia Rizzolo worked on a new page listing various breakage: missing or bad debbindiff output, missing build logs, unavailable build dependencies. Holger Levsen added a new execution environment to run debbindiff using dependencies from testing. This is required for packages built with GHC as the compiler only understands interfaces built by the same version. debbindiff development Version 17 has been uploaded to unstable. It now supports comparing ISO9660 images, dictzip files and should compare identical files much faster. Documentation update Various small updates and fixes to the pages about PDF produced by LaTeX, DVI produced by LaTeX, static libraries, Javadoc, PE binaries, and Epydoc. Package reviews Known issues have been tagged when known to be deterministic as some might unfortunately not show up on every single build. For example, two new issues have been identified by building with one timezone in April and one in May. RD and help2man add current month and year to the documentation they are producing. 1162 packages have been removed and 774 have been added in the past week. Most of them are the work of proper automated investigation done by Chris West. Summer of code Finally, we learned that both akira and Dhole were accepted for this Google Summer of Code. Let's welcome them! They have until May 25th before coding officialy begins. Now is the good time to help them feel more comfortable by sharing all these little bits of knowledge on how Debian works.

18 April 2015

Gregor Herrmann: RC bugs 2015/11-16

only one week left until the jessie release. yay! in the last weeks I didn't find many RC bugs that I could fix; still, here's the short list; nice feature: I mostly help others or could build an work done by others.

3 April 2015

Thorsten Alteholz: My Debian Activities in March 2015

FTP assistant Recently the NEW queue grew due to lots of uploads of new KDE software and several smaller node-packages. The KDE-stuff will be processed one after another, but the node-stuff seems to be rather strange. After the last discussion I was told that all those small packages can be accumulated into bigger chunks. I hope this discussion doesn t need to be repeated again Anyway, this month I marked 117 packages for accept and rejected 51 packages. Squeeze LTS This was my ninth month that I did some work for the Squeeze LTS initiative, started by Raphael Hertzog at Freexian. This month I got assigned a workload of 15.25h and I spent these hours to upload new versions of: Finally I was also able to upload the binutils package. Up to now, I got no complaints that something is not working anymore, so yeah, I seem to make it. The next big adventure will be a new upload of PHP. I already started with some patches, but it is still a good piece of work. I also uploaded update for DLA 164-1] unace security update, [DLA 168-1] konversation security update and [DLA 172-1] libextlib-ruby security update although no LTS sponsor indicated any interest. Other packages This month the severity of one bug in greylistd had been raised from normal to severe and such I had to upload a new version. Thanks to Andreas Beckmann for raising and for providing a patch. I also uploaded a new version of dict-elements and closed a bug related to reproducible builds. As I am the maintainer of libkeepalive, I got an email from Andreas Florath. He wanted to persuade me to create a package for his library libdontdie, which is rather similar to libkeepalive but has some improvements. As I promised to do some more packaging work, he didn t have to argue much and voila, there now is a new package libdontdie available. As the cooperation with him is really pleasant, I also created a package for his other project: pipexec. Donations Thanks alot to all donors, this month I got 30 in total. I really appreciate this and hope that everybody is pleased with my commitment. Don t hesitate to make suggestions for improvements.

2 February 2015

Thorsten Alteholz: My Debian Activities in January 2015

FTP assistant This month at the beginning of the year has been rather quiet as well. All in all I marked 50 packages for accept and rejected only 17 packages. Squeeze LTS This was my seventh month that I did some work for the Squeeze LTS initiative, started by Raphael Hertzog at Freexian. This month I got assigned a workload of 12h and I spent these hours to upload new versions of: In doing so, preparing the upload for php5 consumed most of the time as support from Upstream for the old version in Squeeze no longer exists. Oddly enough, a simple one-line-patch seems to have created a regression I also sponsored the upload of [DLA 133-1] unrtf security update, [DLA 134-1] curl security update and [DLA 130-1] firebird2.1 security update. Many thanks to Nguyen Cong from Toshiba who prepared the patches for these packages. I also uploaded two DLAs for polarssel ([DLA 129-1] polarssl security update and [DLA 144-1] polarssl security update) although no LTS sponsor indicated any interest. Other packages Thanks to the relentless QA work of Andreas Beckmann, his piuparts tests detected an issue in the greylistd package. If greylistd has been installed in Wheezy, removed but not purged afterwards, the whole system dist-upgraded to Jessie and afterwards greylistd is installed again, there would be an error message. RC bug taken, fixed package uploaded and unblock request approved.

20 April 2013

Ulrich Dangel: Analyzing rc bug messages

Michael Stapelberg recently posted a blog post about looking into the number of Debian Developers actively working on RC bugs for the upcoming wheezy release. In this blog post I analyze the data shared by Michael and provide the R commands used to generate the plots & findings. If you are interested into looking into the data yourself, but don t like R, I suggest using ipython notebook + numpy instead.

Analysis After parsing the data file we typically want to get an understanding of the data, by using summary(bugs) we get the minimum(1), median(5), mean(15.4), max(716) and quantiles of the data. This shows that the number of messages is wide-spread and a few people contribute a lot. To visualize the dispersion of the data we can create a box plot showing the range of messages: boxplot As the first and third quantile are close together we can assume that the majority of the work is done by a few, especially since the second quantile is 5. This is supported by the histogram below, where the x axis is the number of recorded messages and y is the number of developers. histogram

Top 10 contributors The TOP 10 contributors, according to the dataset, are:
  1. Lucas Nussbaum - 716 messages
  2. Gregor Herrmann - 270 messages
  3. Jakub Wilk - 270 messages
  4. Andreas Beckmann - 225 messages
  5. Julien Cristau - 205 messages
  6. Cyril Brulebois - 169 messages
  7. Moritz Muehlenhoff - 162 messages
  8. Michael Biebl - 159 messages
  9. Salvatore Bonaccorso - 158 messages
  10. Christoph Egger - 142 messages

r commands These are the commands used to generate the plots and information in this plot:
bugs <- read.csv("by-msg.csv")
summary(bugs)
boxplot(bugs$rcbugmsg, log='y', range=0, ylab="# bugs")
quantile(bugs$rcbugmsg)
0%  25%  50%  75% 100%
1    2    5   12  716
# create histogram
llibrary('ggplot2')
ggplot(bugs, aes(x=rcbugmsg)) + geom_histogram(binwidth=.5, colour="black", fill="black") + scale_x_sqrt()
top10 <- tail(bugs[order(bugs$rcbugmsg),], 10)
top10

Ulrich Dangel: Analyzing rc bug messages

Michael Stapelberg recently posted a blog post about looking into the number of Debian Developers actively working on RC bugs for the upcoming wheezy release. In this blog post I analyze the data shared by Michael and provide the R commands used to generate the plots & findings. If you are interested into looking into the data yourself, but don t like R, I suggest using ipython notebook + numpy instead.

Analysis After parsing the data file we typically want to get an understanding of the data, by using summary(bugs) we get the minimum(1), median(5), mean(15.4), max(716) and quantiles of the data. This shows that the number of messages is wide-spread and a few people contribute a lot. To visualize the dispersion of the data we can create a box plot showing the range of messages: boxplot As the first and third quantile are close together we can assume that the majority of the work is done by a few, especially since the second quantile is 5. This is supported by the histogram below, where the x axis is the number of recorded messages and y is the number of developers. histogram

Top 10 contributors The TOP 10 contributors, according to the dataset, are:
  1. Lucas Nussbaum - 716 messages
  2. Gregor Herrmann - 270 messages
  3. Jakub Wilk - 270 messages
  4. Andreas Beckmann - 225 messages
  5. Julien Cristau - 205 messages
  6. Cyril Brulebois - 169 messages
  7. Moritz Muehlenhoff - 162 messages
  8. Michael Biebl - 159 messages
  9. Salvatore Bonaccorso - 158 messages
  10. Christoph Egger - 142 messages

r commands These are the commands used to generate the plots and information in this plot:
bugs <- read.csv("by-msg.csv")
summary(bugs)
boxplot(bugs$rcbugmsg, log='y', range=0, ylab="# bugs")
quantile(bugs$rcbugmsg)
0%  25%  50%  75% 100%
1    2    5   12  716
# create histogram
llibrary('ggplot2')
ggplot(bugs, aes(x=rcbugmsg)) + geom_histogram(binwidth=.5, colour="black", fill="black") + scale_x_sqrt()
top10 <- tail(bugs[order(bugs$rcbugmsg),], 10)
top10

Next.