Search Results: "Don Armstrong"

29 March 2021

Benjamin Mako Hill: Identifying Underproduced Software

I wrote this blog post with Kaylea Champion and a version of this post was originally posted on the Community Data Science Collective blog. Critical software we all rely on can silently crumble away beneath us. Unfortunately, we often don't find out software infrastructure is in poor condition until it is too late. Over the last year or so, I have been supporting Kaylea Champion on a project my group announced earlier to measure software underproduction, a term we use to describe software that is low in quality but high in importance. Underproduction reflects an important type of risk in widely used free/libre open source software (FLOSS) because participants often choose their own projects and tasks. Because FLOSS contributors work as volunteers and choose what they work on, important projects aren't always the ones to which FLOSS developers devote the most attention. Even when developers want to work on important projects, relative neglect among important projects is often difficult for FLOSS contributors to see. Given all this, what can we do to detect problems in FLOSS infrastructure before major failures occur? Kaylea Champion and I recently published a paper laying out our new method for measuring underproduction at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2021 that we believe provides one important answer to this question.

A conceptual diagram of underproduction. The x-axis shows relative importance, the y-axis relative quality. The top left area of the graph described by these axes is 'overproduction' -- high quality, low importance. The diagonal is Alignment: quality and importance are approximately the same. The lower right depicts underproduction -- high importance, low quality -- the area of potential risk.
Conceptual diagram showing how our conception of underproduction relates to quality and importance of software.
In the paper, we describe a general approach for detecting underproduced software infrastructure that consists of five steps: (1) identifying a body of digital infrastructure (like a code repository); (2) identifying a measure of quality (like the time it takes to fix bugs); (3) identifying a measure of importance (like install base); (4) specifying a hypothesized relationship linking quality and importance if quality and importance are in perfect alignment; and (5) quantifying deviation from this theoretical baseline to find relative underproduction. To show how our method works in practice, we applied the technique to an important collection of FLOSS infrastructure: 21,902 packages in the Debian GNU/Linux distribution. Although there are many ways to measure quality, we used a measure of how quickly Debian maintainers have historically dealt with 461,656 bugs that have been filed over the last three decades. To measure importance, we used data from Debian's Popularity Contest opt-in survey. After some statistical machinations that are documented in our paper, the result was an estimate of relative underproduction for the 21,902 packages in Debian we looked at. One of our key findings is that underproduction is very common in Debian. By our estimates, at least 4,327 packages in Debian are underproduced. As you can see in the list of the most underproduced packages (again, as estimated using just one more measure), many of the most at-risk packages are associated with the desktop and windowing environments, where there are many users but also many extremely tricky integration-related bugs.
This table shows the 30 packages with the most severe underproduction problem in Debian, shown as a series of boxplots.
These 30 packages have the highest level of underproduction in Debian according to our analysis.
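The five steps can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the statistical model from the paper: it assumes that perfect alignment means a package's quality rank equals its importance rank, and it scores each package by the gap between the two. The function names, package names and numbers are all made up.

```python
def rank(values, reverse=False):
    """Map each name to its rank (1 = best) according to its value."""
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {name: position for position, name in enumerate(ordered, start=1)}


def underproduction(days_to_fix, installs):
    """Score each package by how far its quality rank trails its importance rank."""
    # Quality: fewer days to resolve bugs ranks better.
    quality_rank = rank(days_to_fix)
    # Importance: more installs ranks higher.
    importance_rank = rank(installs, reverse=True)
    # A positive score means quality is worse than importance would predict.
    return {name: quality_rank[name] - importance_rank[name] for name in days_to_fix}


# Hypothetical data, for illustration only.
days_to_fix = {"pkg-a": 400, "pkg-b": 30, "pkg-c": 120}
installs = {"pkg-a": 90000, "pkg-b": 500, "pkg-c": 20000}
scores = underproduction(days_to_fix, installs)
```

In this toy data set, the widely installed package with the slowest bug resolution gets the highest score, which is the lower-right corner of the conceptual diagram.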
We hope these results are useful to folks at Debian and the Debian QA team. We also hope that the basic method we've laid out is something that others will build off in other contexts and apply to other software repositories.
In addition to the paper itself and the video of the conference presentation on YouTube by Kaylea, we've put all our code and data in an archival repository on Harvard Dataverse, and we'd love to work with others interested in applying our approach in other software ecosystems.

For more details, check out the full paper which is available as a freely accessible preprint.

This project was supported by the Ford/Sloan Digital Infrastructure Initiative. Wm Salt Hale of the Community Data Science Collective and Debian Developers Paul Wise and Don Armstrong provided valuable assistance in accessing and interpreting Debian bug data. René Just generously provided insight and feedback on the manuscript.

Paper Citation: Kaylea Champion and Benjamin Mako Hill. 2021. Underproduction: An Approach for Measuring Risk in Open Source Software. In Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2021). IEEE.

Contact Kaylea Champion with any questions or if you are interested in following up.

7 November 2017

Don Armstrong: Autorandr: automatically adjust screen layout

Like many laptop users, I often plug my laptop into different monitor setups (multiple monitors at my desk, projector when presenting, etc.). Running xrandr commands or clicking through interfaces gets tedious, and writing scripts isn't much better. Recently, I ran across autorandr, which detects attached monitors using EDID (and other settings), saves xrandr configurations, and restores them. It can also run arbitrary scripts when a particular configuration is loaded. I've packaged it, and it is currently waiting in NEW. If you can't wait, the deb is here and the git repo is here. To use it, simply install the package, and create your initial configuration (in my case, undocked):
 autorandr --save undocked
then, dock your laptop (or plug in your external monitor(s)), change the configuration using xrandr (or whatever you use), and save your new configuration (in my case, workstation):
autorandr --save workstation
repeat for any additional configurations you have (or as you find new configurations). Autorandr has udev, systemd, and pm-utils hooks, and autorandr --change should be run any time that new displays appear. You can also run autorandr --change or autorandr --load workstation manually if you need to. You can also add your own ~/.config/autorandr/$PROFILE/postswitch script to run after a configuration is loaded. Since I run i3, my workstation configuration looks like this:
 xrandr --dpi 92
 xrandr --output DP2-2 --primary
 i3-msg '[workspace="^(1|4|6)"] move workspace to output DP2-2;'
 i3-msg '[workspace="^(2|5|9)"] move workspace to output DP2-3;'
 i3-msg '[workspace="^(3|8)"] move workspace to output DP2-1;'
which fixes the dpi appropriately, sets the primary screen (possibly not needed?), and moves the i3 workspaces about. You can also arrange for configurations to never be run by adding a block hook in the profile directory. Check it out if you change your monitor configuration regularly!
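To make the idea concrete, here is a rough Python sketch of fingerprint-based profile selection. It is not autorandr's code; the function name, the profile table, the output names and the EDID strings are all hypothetical. It only shows the principle: each saved profile remembers which displays were attached, and the profile whose remembered set matches the currently attached set is the one to load.

```python
def detect_profile(profiles, connected):
    """Return the name of the saved profile whose displays match, or None."""
    current = frozenset(connected.items())
    for name, saved in profiles.items():
        # A profile matches only if every output and fingerprint is identical.
        if frozenset(saved.items()) == current:
            return name
    return None


# Hypothetical saved profiles: output name -> display fingerprint.
profiles = {
    "undocked": {"eDP1": "edid-laptop-panel"},
    "workstation": {
        "DP2-1": "edid-left",
        "DP2-2": "edid-middle",
        "DP2-3": "edid-right",
    },
}

selected = detect_profile(profiles, {"eDP1": "edid-laptop-panel"})
```

Matching on the display fingerprint rather than on the output name alone is what lets the same port map to different profiles depending on which monitor is plugged into it.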

10 August 2017

Don Armstrong: Debbugs: 22 Years of Bugs (Debconf 2017)

This is a talk which I presented on August 10th, 2017 at Debconf 17 in Montreal, Canada.

20 September 2016

Gunnar Wolf: Proposing a GR to repeal the 2005 vote for declassification of the debian-private mailing list

For the non-Debian people among my readers: The following post presents bits of the decision-taking process in the Debian project. You might find it interesting, or terribly dull and boring :-) Proceed at your own risk. My reason for posting this entry is to get more people to read the accompanying options for my proposed General Resolution (GR), and have as full a ballot as possible. Almost three weeks ago, I sent a mail to the debian-vote mailing list. I'm quoting it here in full:
Some weeks ago, Nicolas Dandrimont proposed a GR for declassifying
debian-private[1]. In the course of the following discussion, he
accepted[2] Don Armstrong's amendment[3], which intended to clarify the
meaning and implementation regarding the work of our delegates and the
powers of the DPL, and recognizing the historical value that could lie
within said list.
In the process of the discussion, several people objected to the
amended wording, particularly to the fact that "sufficient time and
opportunity" might not be sufficiently bound and defined.
I am, as some of its initial seconders, a strong believer in Nicolas'
original proposal; repealing a GR that was never implemented in the
slightest way basically means the Debian project should stop lying,
both to itself and to the whole free software community within which
it exists, about something that would be nice but is effectively not
While Don's proposal is a good contribution, given that in the
aforementioned GR "Further Discussion" won 134 votes against 118, I
hereby propose the following General Resolution:
=== BEGIN GR TEXT ===
Title: Acknowledge that the debian-private list will remain private.
1. The 2005 General Resolution titled "Declassification of debian-private
   list archives" is repealed.
2. In keeping with paragraph 3 of the Debian Social Contract, Debian
   Developers are strongly encouraged to use the debian-private mailing
   list only for discussions that should not be disclosed.
=== END GR TEXT ===
Thanks for your consideration,
Gunnar Wolf
(with thanks to Nicolas for writing the entirety of the GR text ;-) )
Yesterday, I spoke with the Debian project secretary, who confirmed my proposal has reached enough Seconds (that is, we have reached five people wanting the vote to happen), so I could now formally do a call for votes. Thing is, there are two other proposals I feel are interesting, and should be part of the same ballot, and both address part of the reasons why the GR initially proposed by Nicolas didn't succeed: So, once more (and finally!), why am I posting this? I plan to do the formal call for votes by Friday 23.
[update] Kurt informed me that the discussion period started yesterday, when I received the 5th second. The minimum discussion period is two weeks, so I will be doing a call for votes at or after 2016-10-03.

24 August 2016

Don Armstrong: H3ABioNet Hackathon (Workflows)

I'm in Pretoria, South Africa at the H3ABioNet hackathon which is developing workflows for Illumina chip genotyping, imputation, 16S rRNA sequencing, and population structure/association testing. Currently, I'm working with the imputation stream and we're using Nextflow to deploy an IMPUTE-based imputation workflow with Docker and NCSA's openstack-based cloud (Nebula) underneath. The OpenStack command line clients (nova and cinder) seem to be pretty usable to automate bringing up a fleet of VMs and the cloud-init package which is present in the images makes configuring the images pretty simple. Now if I just knew of a better shared object store which was supported by Nextflow in OpenStack besides mounting an NFS share, things would be better. You can follow our progress in our git repo.

1 January 2016

Bdale Garbee: Term Limited

I woke up this morning and realized that for the first time since 17 April 2001, I am no longer a member of the Debian Technical Committee. My departure from the committee is a consequence of the Debian General Resolution "limiting the term of the technical committee members" that was passed amending the Debian Constitution nearly a year ago. As the two longest-serving members, both over the term limit, Steve Langasek and I completed our service yesterday. In early March 2015, I stepped down from the role of chairman after serving in that role for the better part of a decade, to help ensure a smooth transition. Don Armstrong is now serving admirably in that role, I have the utmost respect for the remaining members of the TC, and the process of nominating replacements for the two now-vacant seats is already well underway. So, for the Debian project as a whole, today is really a non-event... which is exactly as it should be! Debian has been a part of my life since 1994, and I sincerely hope to be able to remain involved for many years to come!

5 August 2015

Don Armstrong: Introducing dqsub

I've been using qsub for a while now on the cluster here at the IGB at UofI. qsub is a command line program which is used to submit jobs to a scheduler to eventually be run on one (or more) nodes of a cluster. Unfortunately, qsub's interface is horrible. It requires that you write a shell script for every single little thing you run, and doesn't do simple things like providing defaults or running multiple jobs at once with slightly different arguments. I've dealt with this for a while using some rudimentary shell scripting, but I finally had enough. So instead, I wrote a wrapper around qsub called dqsub. What used to require a complicated invocation like:
echo -e '#!/bin/bash\nmake foo' | \
 qsub -q default -S /bin/bash -d $(pwd) \
  -l mem=8G,nodes=1:ppn=4 -;
can now be run with
dqsub --mem 8G --ppn 4 make foo;
Want to run some command in every single directory which starts with SRX? That's easy:
ls -1 SRX* | dqsub --mem 8G --ppn 4 --array chdir make bar;
Want instead to behave like xargs but do the same thing?
ls -1 SRX* | dqsub --mem 8G --ppn 4 --array xargs make bar -C;
Now, this wrapper isn't complete yet, but it's already more than enough to do what I require, and has saved me quite a bit of time already. You can steal dqsub for yourself. Feel free to request specific features, too.
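For readers curious what such a wrapper has to do, here is a small Python sketch of the core step: turning a command and a few options into the script and argument list of the long qsub invocation shown earlier. This is not dqsub's implementation; the function name build_qsub and its defaults are assumptions, and the qsub flags are simply the ones from the example above.

```python
import os
import shlex


def build_qsub(command, mem="8G", ppn=4, queue="default", cwd=None):
    """Return (script, argv) for submitting `command` through qsub."""
    # The job script that would be piped to qsub on standard input.
    script = "#!/bin/bash\n" + " ".join(shlex.quote(c) for c in command) + "\n"
    # Same flags as the hand-written invocation; only the values vary.
    argv = [
        "qsub", "-q", queue, "-S", "/bin/bash",
        "-d", cwd or os.getcwd(),
        "-l", "mem={},nodes=1:ppn={}".format(mem, ppn),
        "-",
    ]
    return script, argv


script, argv = build_qsub(["make", "foo"], cwd="/tmp/work")
```

Submitting the job would then be a matter of running argv with script on standard input, for example with subprocess.run(argv, input=script, text=True). Keeping the construction separate from the submission makes the defaults easy to inspect without a cluster at hand.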

28 July 2015

Norbert Preining: ePub editor Sigil landed in Debian

A long, long time ago I wanted Sigil, an epub editor, to appear in Debian. There was a packaging wishlist bug from back in 2010 with intermittent activity. But thanks to a concerted effort, especially by Mattia Rizzolo and Don Armstrong, packaging progressed to a state where I could sponsor the upload to experimental about 4 months ago. And yesterday, after a long wait, Sigil finally passed the watchful eyes of the Debian ftp-masters and has entered Debian/experimental. I have already updated the packaging for the latest version 0.8.7, which will be included in Debian/sid rather soon. Thanks again, especially to Mattia, for his great work.

15 November 2014

Don Armstrong: Adding a newcomer tag to the BTS

Some of you may already be aware of the gift tag which has been used for a while to indicate bugs which are suitable for new contributors to use as an entry point to working on specific packages. Unfortunately, some of us (including me!) were unaware that this tag even existed. Luckily, Lucas Nussbaum clued me in to the existence of this tag, and after a brief bike-shed-naming thread, and some voting using pocket_devotee we decided to name the new tag newcomer, and I have now added this tag to the BTS documentation, and tagged all of the bugs which were user tagged "gift" with this tag. If you have bugs in your package which you think are ideal for new contributors to Debian (or your package) to fix, please tag them newcomer. If you're getting started in Debian, and working on bugs to fix, please search for the newcomer tag, grab the helm, and contribute to Debian.

Don Armstrong: Virginia King selected for Debbugs FOSS Outreach Program for Women

I'm glad to announce that Virginia King has been selected as one of the three interns for this round of the FOSS Outreach Program for Women. Starting December 9th, and continuing until March 9th, she'll be working on improving the documentation of Debian's bug tracking system. The initial goal is to develop a Bug Triager Howto to help new contributors to Debian jump in and help existing teams triage bugs. We'll be getting in touch with some of the larger teams in Debian to help make this document as useful as possible. If you're a member of a team in Debian who would like this howto to address your specific workflow, please drop me an e-mail, and we'll keep you in the loop. The secondary goals for this project are to:

19 September 2014

Dariusz Dwornikowski: getting real "done date" of a bug from Debian BTS

As I wrote in my last post, currently neither the SOAP interface nor the Ultimate Debian Database provides the date when a given bug was closed (the done date). It is quite hard to calculate statistics on a bug tracker when you do not know when a bug was closed (!!). The done date of a bug can be found in its log. The log itself can be downloaded with the SOAP method get_bug_log, but processing it is quite complicated. The same goes for web scraping the BTS's web interface. Fortunately, the web interface makes it possible to download a log in mbox format. Below is a script that extracts the done date of a bug from its log in mbox format. It uses requests to download the mbox and caches the result in ~/.cache/rfs_bugs, which you need to create. It performs different checks:
  1. Check existence of a header e.g. Received: (at 657783-done) by bugs.debian.org; 29 Jan 2012 13:27:42 +0000
  2. Check for header CC: NUMBER-close|done
  3. Check for header TO: NUMBER-close|done
  4. Check for Close: NUMBER in body.
The code is below:
import mailbox
import os
import re
import tempfile
from datetime import datetime

import requests

CACHE_DIR = os.path.join(os.path.expanduser("~"), ".cache", "rfs_bugs")
# Captures the date in "(at 657783-done) by bugs.debian.org; 29 Jan 2012 ..."
RECEIVED_DATE = re.compile(
    r"\(at\s.+\)\s+by\sbugs\.debian\.org;\s(\d{1,2}\s\w\w\w\s\d\d\d\d)")


def get_done_date(bug_num):
    cache_file = os.path.join(CACHE_DIR, str(bug_num))
    # Return the cached date if this bug was already looked up.
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return datetime.strptime(f.readline().rstrip(), "%Y-%m-%d").date()
    # The leading part of the BTS URL is not shown here; supply it before running.
    r = requests.get(";bug={};mboxstatus=yes".format(bug_num))
    # Try each check in turn until one of them yields a date.
    d = try_header(r.text)
    if d is None:
        d = try_cc(r.text)
    if d is None:
        d = try_body(r.text)
    if d is None:
        return None
    with open(cache_file, "w") as f:
        f.write("{}".format(d.date()))
    return d.date()


def load_mbox(text):
    # mailbox.mbox reads from a file on disk, so write the log out first.
    handle, name = tempfile.mkstemp()
    with os.fdopen(handle, "w") as f:
        f.write(text)
    return mailbox.mbox(name)


def received_date(message):
    # Parse the date from the message's Received header, if there is one.
    try:
        result = RECEIVED_DATE.search(message["Received"])
        return datetime.strptime(result.group(1), "%d %b %Y")
    except (TypeError, AttributeError, ValueError):
        return None


def try_header(text):
    # Check 1: a raw "Received: (at NUMBER-close|done) by ..." line.
    reg = r"Received:\s\(at\s\d+-(close|done)\)\s+by.+"
    try:
        line = re.search(reg, text).group()
        result = re.search(r"\d{1,2}\s\w\w\w\s\d\d\d\d", line)
        return datetime.strptime(result.group(), "%d %b %Y")
    except (AttributeError, ValueError):
        return None


def try_cc(text):
    # Checks 2 and 3: a CC or To header addressed to NUMBER-done.
    for _, message in load_mbox(text).items():
        if ("done" in str(message["CC"] or "")
                or "done" in str(message["To"] or "")):
            return received_date(message)
    return None


def try_body(text):
    # Check 4: a message whose body mentions "close" or "done".
    for _, message in load_mbox(text).items():
        if message.is_multipart():
            parts = [str(m) for m in message.get_payload()]
        else:
            parts = [message.get_payload()]
        if any("close" in p or "done" in p for p in parts):
            return received_date(message)
    return None


if __name__ == "__main__":
    print(get_done_date(752210))
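As a quick check of the first test, the sample Received header from item 1 of the list can be run through regular expressions in the style of try_header. This is only a small standalone sketch, separate from the script; the host name in the sample header is the one the script's own regular expressions look for, and the bug-number pattern is relaxed to any number of digits.

```python
import re
from datetime import datetime

header = "Received: (at 657783-done) by bugs.debian.org; 29 Jan 2012 13:27:42 +0000"

# First find the whole Received line for a -close or -done address ...
match = re.search(r"Received:\s\(at\s\d+-(close|done)\)\s+by.+", header)
# ... then pull the "day month year" part out of it.
date_text = re.search(r"\d{1,2}\s\w\w\w\s\d\d\d\d", match.group()).group()
done_date = datetime.strptime(date_text, "%d %b %Y").date()
print(done_date)
```

Note that %b depends on the locale, so month abbreviations such as "Jan" are only parsed correctly under an English or C locale.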
PS: I hope that the script will not be needed in the near future, as Don Armstrong plans a new BTS database; a Debconf14 video is here.

8 September 2014

Jaldhar Vyas: Debconf 14 - Days 1 and 2

Unfortunately I was not able to attend debconf this year but thanks to the awesome video team all the talks are available for your viewing pleasure. In order to recreate an authentic Portland experience, I took my laptop into the shower along with a vegan donut and had my children stand outside yelling excerpts from in whiny Canadianesque accents. Here are some notes I took as I watched the talks.
Welcome Talk
Debian in the Dark Ages of Free software - Stefan Zacchiroli
Weapons of the Geek - Gabriella Coleman
-- Database Ho! - Don Armstrong
Grub Ancient and Modern - Colin and Watson
One year of fedmsg in Debian - Nicolas Dandrimont
Coming of Age: My Life with Debian - Christine Spang
Status report of the Debian Printing Team - Didier Raboud

2 August 2014

Don Armstrong: ErgoDox keyboard assembly

I routinely use a Kinesis Advantage Pro keyboard, which is a split, ergonomic keyboard with thumb clusters that uses Cherry MX Brown switches. Over the thirteen years that I've been using it, I've become a huge fan of this style of keyboard. However, I have two major annoyances with the Kinesis. First, while the firmware is good, remapping the keys is complicated, and producing more complicated keyboard layouts with layers and keycodes that are not present in the original layout is not possible. Secondly, the interconnect between the main key wells and the controller board in the middle occasionally fails, and requires disassembly and occasional re-tinning of the circuit board interconnect connector. About a year ago, I became aware of the ErgoDox keyboard, which is a keyboard design which mimics the Kinesis to some degree, but with completely separated key halves (useful, because I'm substantially bigger than the average human), programmable firmware (so I can finally have the layers and missing keys) and with slightly more elegant interconnects (TRRS cables). Unfortunately, at the time I first heard about it (and other custom keyboards), making it required sourcing circuit boards and parts, and finding someone to cut a case for the keyboard. Then, a few months ago, I learned about MassDrop, a company that puts together groups of people to do buys of products at near-wholesale prices, and their offer of all of the parts to build an ErgoDox. After waiting for a group buy of the keyboard to become available, I put in an order, and received the parts two months later. Over a few hours yesterday, I learned how to do surface mount soldering of the 78 diodes (one for each key), and finished assembling and flashing the firmware. This morning, I fixed up the few key bindings that I needed to be productive, and voilà, my laptop at home now has a brand new ergonomic keyboard.

30 May 2014

Don Armstrong: Dropbox Recursive Downloader

I'm working on some analyses for the Genetic Analysis Workshop #19, which has placed its data on Dropbox. Unfortunately, Dropbox doesn't allow people to download zip archives larger than 1GB, and the data was made available in an unpacked structure with more than a hundred files. Some searching indicated that no one had written a recursive downloader for Dropbox, so 30 minutes of hacking with WWW::Mechanize later, I wrote a simple recursive downloader for Dropbox. Two hours later, all of the files had downloaded.
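The approach is an ordinary recursive walk over folder listings. The sketch below is not the WWW::Mechanize script itself (which is Perl); it is a Python outline built around a stand-in listing function and an in-memory fake tree, both hypothetical, so that the recursion is visible without guessing at Dropbox's page structure.

```python
def walk(list_folder, path=""):
    """Yield the path of every file below `path`, depth first."""
    for name, is_folder in list_folder(path):
        child = path + "/" + name if path else name
        if is_folder:
            # Descend into sub-folders before moving on to the next entry.
            yield from walk(list_folder, child)
        else:
            yield child


# Hypothetical folder listing: path -> [(entry name, is it a folder?)].
FAKE_TREE = {
    "": [("genotypes", True), ("README.txt", False)],
    "genotypes": [("chr1.csv", False), ("chr2.csv", False)],
}

files = list(walk(FAKE_TREE.__getitem__))
```

A real downloader would replace the fake listing with a function that fetches a folder page and extracts its entries, and would download each yielded file instead of collecting its path.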

25 February 2014

Don Armstrong: Debian Booth at Scale 12x

I spent the weekend at SCALE 12x running the Debian booth. SCALE is one of the best conferences that I get to attend every year; it has a great mix of commercial exhibitors and community groups, and routinely gets great speakers. As I've done for quite some time, I organized a Debian booth there, and talked to lots of people about Debian. If you're in the Southern California area, or have a chance to give a talk for SCALE 13x, you should do so! Thanks again to Matt Kraai and Paul Hardy for helping out in the Debian booth all weekend!

12 February 2014

Don Armstrong: Working with Org-mode: Committing Changes Everywhere

I'm a huge fan of Org-mode, and I keep all of my org-mode files in git repositories which are under myrepos control. However, because I often make lots of changes to my agenda and notes, I hate having to manually visit each individual project and make changes to it. [And it's also annoying when I forget to commit a specific change and then have to try to get my laptop and desktop back into sync.] Luckily, myrepos can easily run a command in parallel in all of the repositories! The following "update_org_files" command will update all of my org-file containing repositories in parallel:
#!/bin/bash
ORG_GREP='-e .org$ -e .org_archive$ -e .org_done$'
if [ "x$1" == "xdoit" ]; then
    # Commit and push only if a modified org file is present.
    if git status --porcelain -z | grep -z '^ M' | grep -zq $ORG_GREP; then
        git status --porcelain -z | grep -z '^ M' | grep -z $ORG_GREP | \
            sed -z 's/^ M//g' | \
            xargs -0 git commit -m'update org files'
        git push;
    fi;
else
    # Save all open org buffers, then re-run this command in every repository.
    emacsclient -n -e '(org-save-all-org-buffers)' >/dev/null 2>&1
    mr -d ~ -j5 run update_org_files doit;
fi;
An updated version of this lives in my git repository.
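The filtering that the grep and sed stages perform can also be written out explicitly. The following Python sketch is only an illustration of that selection step, not part of the script; the function name is made up, and it assumes the NUL-separated output format of git status --porcelain -z, where each entry is a two-character status, a space, and the path.

```python
ORG_SUFFIXES = (".org", ".org_archive", ".org_done")


def modified_org_files(porcelain_z):
    """Return modified, unstaged org files from `git status --porcelain -z` output."""
    files = []
    for entry in porcelain_z.split("\0"):
        # Each entry is a two-character status, a space, then the path.
        # Rename entries (which carry a second path) are not handled specially.
        status, path = entry[:2], entry[3:]
        if status == " M" and path.endswith(ORG_SUFFIXES):
            files.append(path)
    return files


# Hypothetical status output, for illustration only.
sample = " M notes.org\0 M src/main.c\0?? new.org\0 M old.org_archive\0"
selected = modified_org_files(sample)
```

Matching on the file suffix is slightly stricter than the grep patterns, in which the unescaped dot matches any character.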

19 November 2013

Raphaël Hertzog: Will Debian's technical committee coopt Keith Packard or Philipp Kern?

The process has been ongoing for more than a year but the Debian technical committee is about to select a candidate to recommend for its vacant seat. The Debian Project Leader will then (likely) appoint him (looks like it won't be a woman). According to recent discussions on, it seems that either Keith Packard or Philipp Kern will join the committee. If you look at the current membership of the committee, you will see: That's very Anglo-Saxon centric (6 out of 7 members). While I trust the current members and while I know that they are open-minded people, it still bothers me to see this important body with so little diversity. Coming back to the choice at hand, Keith Packard is American and Philipp Kern is German. No new country in the mix. I can only hope that Philipp will be picked to bring some more balance to the body.


17 October 2013

Don Armstrong: Using Mutt with Org Mode (with refile)

I use org mode extensively, and had added Zack's workflow for integrating mutt with org mode to my ~/.emacs some time ago. However, I've been annoyed that refiling closes the org-capture frame before refiling finishes. The following trivial modification to Zack's code (which I previously modified to work with org-mode >= 0.8) waits to close the frame until you've finished refiling.
 (require 'org-protocol)
 (add-hook 'org-capture-mode-hook 'delete-other-windows)
 (setq my-org-protocol-flag nil)
 (defadvice org-capture-finalize (after delete-frame-at-end activate)
   "Delete frame at remember finalization"
   (progn (if my-org-protocol-flag (delete-frame))
          (setq my-org-protocol-flag nil)))
 (defadvice org-capture-refile (around delete-frame-after-refile activate)
   "Delete frame at remember refile"
   (if my-org-protocol-flag
       (progn
         (setq my-org-protocol-flag nil)
         ad-do-it
         (delete-frame))
     ad-do-it))
 (defadvice org-capture-kill (after delete-frame-at-end activate)
   "Delete frame at remember abort"
   (progn (if my-org-protocol-flag (delete-frame))
          (setq my-org-protocol-flag nil)))
 (defadvice org-protocol-capture (before set-org-protocol-flag activate)
   (setq my-org-protocol-flag t))
Now, the frame automatically disappears after you refile it, keeping my clean.

15 May 2013

Benjamin Mako Hill: Sounds Like a Map

Colored visualization of the puzzle.
I love maps, something that became clear to me when I was looking at the tag cloud of my bookmarks a few years back. One of my favorite blogs (now a book) is Frank Jacobs' Strange Maps. So it's no coincidence that a number of my favorite MIT Mystery Hunt puzzles are map based. Trying to connect the two worlds, I sent Jacobs a write-up of the hunt and of a particularly strange sound-based map puzzle called White Noise that I worked with Don Armstrong to solve in the 2006 hunt. While I wasn't paying attention, Jacobs did a very nice writeup of my writeup of the puzzle for Strange Maps!

14 March 2013

Don Armstrong: libravatar for the BTS (and boring encoding fixes)

While working on fixing a few encoding problems that I managed to introduce to the BTS almost half a year ago, I took a side bit of coding, and introduced libravatar support to the BTS. Every e-mail now has an avatar to the right which should correspond to the sender. Libravatar is a federated service, which means that if you control your domain, you can serve your own icons. It also automatically falls back to gravatar, so if you're using that service, things should "just work". Hopefully this will be primarily amusing, and people won't abuse it. More importantly, but much less fun, the double encoding problems (where mails would get double-encoded if any of the headers contained non-us-ascii text), and the mojibake wontfix icon should be fixed now. If you see any additional cases of this, please report them to
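Two of the ideas above are easy to show in a few lines of Python. This is a sketch and not the BTS code (debbugs is written in Perl): the first function assumes the usual Libravatar scheme of hashing the trimmed, lowercased address with MD5, and the base URL is an assumption that ignores the federated per-domain lookup; the second reproduces the double-encoding effect by misreading UTF-8 bytes as Latin-1.

```python
import hashlib


def avatar_url(email, base="https://seccdn.libravatar.org/avatar/"):
    """Build an avatar URL from the MD5 of the trimmed, lowercased address."""
    digest = hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()
    return base + digest


def double_encode(text):
    """Reproduce the bug: UTF-8 bytes misread as Latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


print(avatar_url("  Someone@Example.org "))
print(double_encode("Raphaël"))
```

Reversing the second function, by encoding as Latin-1 and decoding as UTF-8, recovers the original text, which is why this kind of mojibake is usually repairable once it has been spotted.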