Search Results: "Michael Stapelberg"

13 July 2013

Michael Stapelberg: Survey answers part 3: systemd is not portable and what this means for our ports

This blog post is the third of a series of posts dealing with the results of the Debian systemd survey. I intend to give a presentation at DebConf 2013, too, so you could either read my posts, or watch the talk, or both :-).

The second-biggest concern in the survey results was that systemd is not portable to non-Linux systems, for example Debian GNU/kFreeBSD or Debian GNU/HURD. For convenience, I will from now on write kFreeBSD when I really mean all non-Linux ports.

systemd not being portable is not an arbitrary decision

Some people seem to think that the systemd upstream is just hostile to users of other operating systems when they hear that systemd is not portable. However, keep in mind that Lennart, Kay and other contributors have considerable experience with writing portable software such as Avahi and PulseAudio. The decision to only support Linux in systemd was thus not taken lightly.

systemd's design requires many kernel features and certain semantics (e.g. procfs is not enough, /proc/$PID/exe needs to be supported), which are currently only available on Linux. Point 15 of Lennart's blog post 0pointer.de/blog/projects/the-biggest-myths.html contains an incomplete list of these features.

Maintaining portable code increases complexity

Since systemd is written in C, the canonical way to write portable code is by using conditional compilation, for example with #ifdef statements. That makes the code harder to understand and reason about, but more importantly it blows up the test matrix. It also requires any new change to be tested on all supported systems, which is not feasible for most contributors. I think everybody agrees keeping complexity low is a good thing.

What are the implications for our non-Linux ports?

We, the Debian project, have two realistic options in my point of view:
  1. We stay with sysvinit, the least common denominator, forever.
  2. We use a modern init system such as systemd on Debian GNU/Linux.
In case we go with the first option, Debian will diverge more and more from virtually every other Linux distribution. This also means we are stuck with the limited features and capabilities that sysvinit has. A modern operating system needs to be able to adapt to a changing environment and the once static world in which sysvinit was developed has become much more dynamic. In our opinion, the only reasonable choice is the latter option: use systemd on Debian GNU/Linux.

How will this work?

For individual maintainers, this means they need to support two init systems. Luckily, systemd service files are usually really simple, but there still is additional maintenance work such as testing whether your service actually starts when using systemd. We think this maintenance overhead is justified due to the advantages a modern init system brings.

Of course, not every maintainer can arrange to install systemd and test his/her package. You are invited to contact us at pkg-systemd-maintainers@lists.alioth.debian.org at any time and we can help you out. Furthermore, automation can be introduced (and we have a proof of concept) to make it easier to spot mistakes and perform some simple tests, such as whether the service can be started.

A concern that was voiced is that as sysvinit usage decreases, the init scripts would bit-rot and stop working at some point. If that happens, we rely on the users to file appropriate bug reports. This is no different from the situation today: it is not feasible for maintainers to test every single combination of features all the time.

Ports are different and that diversity is good

The FreeBSD kernel and the Linux kernel are different, and each kernel provides distinct features that the other kernel does not have. As an example, Linux provides cgroups, which are heavily used by systemd. The FreeBSD kernel in turn offers the packet filter pf, which is not available on Linux. There certainly is value in having common infrastructure. But there also is value in providing the best features that each platform has to offer; in the case of Linux, that is clearly systemd as an init system, IMO.

Conclusion

systemd is not portable because it relies on features only the Linux kernel provides; an example is cgroups, which systemd uses to track processes in a reliable way. Not embracing these features and staying with sysvinit indefinitely is not a viable option if Debian wants to remain relevant for today's demands. In the short term, the migration to systemd will cause additional maintenance effort for individual package maintainers, but it will pay off in the long term.
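
The /proc/$PID/exe requirement mentioned in this post can be probed from a shell. The following is only a minimal sketch: the function name probe_proc_exe is made up for illustration, and a positive result says nothing about the other kernel features (such as cgroups) that systemd relies on.

```shell
#!/bin/sh
# Hypothetical helper: reports whether this system exposes /proc/<pid>/exe
# as a resolvable symlink for the current shell process.
probe_proc_exe() {
    if [ -L "/proc/$$/exe" ] && readlink "/proc/$$/exe" >/dev/null 2>&1; then
        echo "supported"
    else
        echo "not supported"
    fi
}

probe_proc_exe
```

On a kernel that provides procfs without the exe link, the sketch should print "not supported", which is the situation the post describes for some non-Linux ports.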

2 July 2013

Bits from Debian: all Debian source are belong to us

This is a verbatim repost of Stefano Zacchiroli's post.

TL;DR: go to http://sources.debian.net and enjoy.

Debsources is a new toy I've been working on at IRILL together with Matthieu Caneill. In essence, Debsources is a simple web application that lets you publish an unpacked Debian source mirror on the Web. You can deploy Debsources where you please, but there is a main instance at http://sources.debian.net (sources.d.n for short) that you will probably find interesting. sources.d.n follows the Debian archive closely in two ways:
  1. it is updated 4 times a day to reflect the content of the Debian archive
  2. it contains sources coming from official Debian suites: the usual ones (from oldstable to experimental), *-updates (ex volatile), *-proposed-updates, and *-backports (from Wheezy on)
Via sources.d.n you can therefore browse the content of Debian source packages with the usual code viewing features like syntax highlighting. More interestingly, you can search through the source code (of unstable only, though) via integration with http://codesearch.debian.net. You can also use sources.d.n programmatically to query available versions or link to specific lines, with the possibility of adding contextual pop-up messages.

In fact, you might have stumbled upon sources.d.n already in the past few days, via other popular Debian services where it has already been integrated. In particular: codesearch.d.n now defaults to showing results via sources.d.n, and the PTS has grown new "browse source code" hyperlinks that point to it. If you have ideas for other Debian services where sources.d.n should be integrated, please let me know.

I find Debsources and sources.d.n already quite useful but, as often happens, there is still a lot to do. Obviously, it is all Free Software (released under GNU AGPLv3). Do not hesitate to report new bugs and, better, to submit patches for the outstanding ones.

PS: in case you were wondering, at present sources.d.n requires ~381 GB of disk space to hold all uncompressed source packages, plus ~83 GB for the local (compressed) source mirror.

1 July 2013

Michael Stapelberg: Survey answers part 2: the transition

This blog post is the second of a series of posts dealing with the results of the Debian systemd survey. I intend to give a presentation at DebConf 2013, too, so you could either read my posts, or watch the talk, or both :-).

It seems that it is unclear how Debian's transition to systemd is intended to work. By "transition", we mean going from the current state (sysvinit is the default and fully supported) to a state in which systemd is fully supported. Then by merely installing systemd by default and letting it provide /sbin/init, we can make it the default init system. If and when that happens is a different matter and it's not necessary for all packages to have systemd support.

sysvinit compatibility

systemd natively supports sysvinit scripts, meaning your existing package will work as-is but you cannot utilize all the features that systemd provides. The sysvinit support works very well, as you can try in a fresh Debian wheezy VM. In the output of "systemctl list-units", every entry which has an LSB: prefix is actually a sysvinit script.

systemd decides whether to use an init script or a service file by checking whether a service file with a corresponding name exists. That is, if e.g. apache2.service exists, systemd will prefer it over /etc/init.d/apache2.

To make this crystal clear: it is not necessary to ship service files for all services in some kind of flag day. systemd supports a mixed installation where some services use init scripts and some services use service files.

Adding systemd support to your package

In a nutshell, it usually works like this:
  1. Install a service file to /lib/systemd/system/foo.service.
    Often, upstream already provides and installs a .service file.
    If not, you can place your file at debian/package.service
    Make sure that your service file name corresponds to the sysvinit script name
    (e.g. apache2.service for /etc/init.d/apache2)
  2. Ensure your service file(s) are enabled and started.
    We strongly recommend that you use our package dh-systemd.
    If you use dh(1), add --with=systemd in debian/rules and Build-Depend on dh-systemd
  3. Test your package, see the next section.
For details see wiki.debian.org/Systemd/Packaging.

Testing systemd

We carefully made sure that you can install the systemd Debian package on your machine alongside sysvinit without breaking anything. The systemd package does not conflict with any other packages, it will not replace /sbin/init and systemd will not be enabled right away. It is only after you specify the kernel parameter init=/bin/systemd in /etc/default/grub that you switch to systemd. In case you want to go back, simply boot without this kernel parameter.

Conclusion

In conclusion, the transition is straightforward and the necessary infrastructure is in place. systemd is available in Debian and can be used today. Packages can add systemd support whenever their maintainer(s) feel like it. There is no need for a flag day. We can switch the default whenever we think we are ready.
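
The lookup rule from the sysvinit compatibility section (a service file with a matching name wins over the init script) can be sketched in a few lines of shell. This is an illustration of the documented preference only, not how systemd implements it; the function name resolve_service and the scratch directories are made up for the example.

```shell
#!/bin/sh
# Illustration only: mimics the documented preference, not systemd's real code.
# $1: service name, $2: directory with unit files, $3: directory with init scripts
resolve_service() {
    if [ -e "$2/$1.service" ]; then
        echo "native unit: $2/$1.service"
    elif [ -x "$3/$1" ]; then
        echo "sysvinit script: $3/$1"
    else
        echo "unknown service: $1"
        return 1
    fi
}

# Demo in a scratch directory with hypothetical services:
# apache2 has both a unit file and an init script, cron only an init script.
demo=$(mktemp -d)
mkdir -p "$demo/system" "$demo/init.d"
touch "$demo/system/apache2.service"
printf '#!/bin/sh\n' > "$demo/init.d/apache2"
printf '#!/bin/sh\n' > "$demo/init.d/cron"
chmod +x "$demo/init.d/apache2" "$demo/init.d/cron"

resolve_service apache2 "$demo/system" "$demo/init.d"
resolve_service cron "$demo/system" "$demo/init.d"
```

The first call reports the unit file even though an init script of the same name exists, the second falls back to the init script. That is the mixed installation described above.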

17 June 2013

Michael Stapelberg: Talk about Debian Code Search

At this year's GPN13 I gave a talk about Debian Code Search. It was in German, so I spent a few hours creating English subtitles. You can watch the video at youtube.com/watch?v=n6DtW3zCTvk with English subtitles.

In case you prefer to download the file(s), get http://ftp.ccc.de/events/gpn/gpn13/gpn13-debian-code-search.mp4 (84 MiB) and the corresponding subtitle file at http://t.zekjur.net/gpn13-debian-code-search.srt. Drop both files in the same directory, run "mplayer gpn13-debian-code-search.mp4" and press v to enable subtitles. I intend to eventually put the (subtitled) video on YouTube and refer to it from codesearch.debian.net, but I wanted to post the video in its current form already.

The presentation itself explains the motivation behind Debian Code Search and how it works. You don't need any knowledge of the system in order to understand the talk. Enjoy!

8 June 2013

Michael Stapelberg: Survey answers part 1: systemd has too many dependencies, or it is bloated, or it does too many things, or is too complex

This blog post is the first of a series of posts dealing with the results of the Debian systemd survey. I intend to give a presentation at DebConf 2013, too, so you could either read my posts, or watch the talk, or both :-). The top concern shared by most people is:
systemd has too many dependencies, or it is bloated, or it does too many things, or is too complex
Now this concern actually has a lot of different facets, and I am trying to share my opinion on each of them.

systemd has too many dependencies

First, let's start with "too many dependencies", because that is easy to check and reason about. I have created a document which lists all dependencies of the systemd binary itself (pid 1) and all the binaries which are currently shipped by the systemd Debian package. If you don't want to take my word for it, please read that document.

Have you read the document? Very nice! As you can see, the systemd binary itself has 10 dependencies (excluding libc). Now, the question is, what is bad about dependencies? Why do people list dependencies as a top concern?
  1. Cyclic dependencies. When you hear that your init system depends on DBus, you might argue that there is a cyclic dependency here, because DBus needs to be started by the init system. However, systemd does NOT depend on dbus-daemon (!) to boot your machine. Instead of using the system bus, it uses a private UNIX socket. Therefore, systemd uses DBus merely as a serialization format for IPC between its different processes. Only when you want to access systemd via its API as a user (non-root) do you actually use the system bus. Since we are talking about DBus: DBus provides a well-tested serialization format and IPC mechanism so that systemd doesn't have to reinvent the wheel and instead benefits from wide support within languages.
  2. Complicated code. I feel like there is the implicit assumption that lots of dependencies correlate with complicated code that is easy to break. I encourage you to have a look at systemd's source code: look for the places where specific libraries are used, e.g. enforce_user which uses libcap. You'll notice that the code is not complex and usage of the libraries is clear.
  3. Software dragging in lots of library packages. The libraries which systemd uses are already in widespread use (e.g. DBus, udev, selinux, libcap, pcre, ...). On a typical Debian installation, only very few of them will be dragged in by systemd, if at all. As an example, on a fresh Debian Wheezy installation, fewer than 10 packages will end up on your machine when running "apt-get install systemd".
  4. More memory use. The Linux kernel maps libraries into memory only once, no matter how many processes use them. As stated in the dependency list, on machines where the libraries are not already loaded, systemd brings in about 500 KiB of additional memory-mapped libraries in the worst case. On the machines we have these days, this is a reasonable cost to pay for all the benefits systemd gets us. This holds true on embedded systems with only a few MiB of RAM and especially on typical workstations with 8+ GiB of RAM.
systemd is bloated

Now, let's talk about bloat. Again, this is a point which has many facets. I'd like to quote the Wikipedia definition of software bloat:
Software bloat is a process whereby successive versions of a computer program become perceptibly slower, use more memory or processing power, or have higher hardware requirements than the previous version whilst making only dubious user-perceptible improvements.
The first part of the definition certainly does not match systemd: it is measurably faster than sysvinit. As for memory usage: systemd's RSS is 1.8 MiB, whereas sysvinit uses 0.8 MiB. As I argued on the "More memory use" point in the dependencies section, I think the additional resource cost is well worth the benefits.

Also note that systemd's features are NOT all implemented in the binary which is PID 1. As explained in the dependency list, systemd consists of many cleanly separated binaries. So if a new version of systemd gathers an additional feature, this does not mean that your PID 1 will be bigger.

While systemd runs on any hardware, it has an indirect hardware requirement: it requires some Linux kernel features (which are all enabled in Debian kernels). That might rule out usage of systemd on really old embedded hardware where you don't have a chance to update the kernel. While it is sad that those machines cannot profit from systemd, switching to systemd as a default has no downside either: Debian continues to support sysvinit for quite some time, so these machines will continue to work even with upcoming Debian versions.

systemd does too many things

The Wikipedia definition continues:
[...] perceived bloat can occur from the software servicing a large, diverse marketplace with many differing requirements. Most end users will feel they only need some limited subset of the available functions and will regard the others as unnecessary bloat, even if people with different requirements do use them.
I think the last part of the Wikipedia definition applies to systemd: it does service "a large and diverse marketplace". That marketplace is the entirety of existing software which is started by an init system. Also, systemd can be used on a wide range of hardware (embedded devices, tablets, phones, notebooks, desktops, servers) which requires different features. As an example: on a desktop system you typically don't care strongly about a watchdog feature, but on embedded devices or servers that feature is very handy. Similarly, on a tablet, forward secure sealing of logfiles is not as important as on a server.

Therefore, I can understand if you feel that you don't need many of the features systemd provides. But please think of other users and maintainers who are very happy with systemd's benefits.

Also note that while systemd supports many things (in separate binaries!), you don't have to use them all. It still makes sense to ship them all in the same package. Take coreutils as another example in that area. The binaries belong together, even though you most likely haven't used all of them (e.g. od, pr, ptx, ... :-)).

systemd is too complex

The remaining concern is that systemd is too complex. In my experience, complexity is often inherent to a specific area and one cannot simply make it go away. Instead, there are different models of how that complexity is represented.

Think of the monolithic Linux kernel versus the MINIX microkernel. The latter has very few lines of source in the kernel, but puts the complexity into userspace. The former uses a different approach with more source in the kernel. The arguments between both camps show that neither is clearly right or clearly wrong.

In a way, sysvinit represents the MINIX model: it has a small core (the init binary itself), but a lot of complexity in shell scripts and external programs. The fact that solutions are copied from one init script to another leads to lots of subtle errors and makes code reuse really hard. systemd however has more source code in the binaries, but requires only very simple, descriptive, textual configuration instead of complex init scripts. To me, it seems preferable to have the complexity in a single place instead of distributed across lots of people and projects.

Conclusion

In a way, you are right. systemd centralizes complexity from tons of init scripts into a single place. However, it therefore makes it very easy for maintainers to write service files (the equivalent of an init script) and provides a consistent and reliable interface for service management.

Furthermore, it is different from sysvinit, and different solutions often seem complex at first.

While systemd consumes more resources than sysvinit, it uses them to make more information available about services; its finer-grained service management requires more state-keeping, but in turn offers you more control over your services.
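
To give an idea of what "simple, descriptive, textual configuration" means in practice, here is a minimal sketch that writes a service file into a scratch directory. The daemon name foo and the path /usr/sbin/foo are hypothetical; the section and option names ([Unit], [Service], [Install], Description, ExecStart, WantedBy) are standard systemd unit file syntax.

```shell
#!/bin/sh
# Write a minimal unit file for a hypothetical daemon "foo" into a
# scratch directory (a real package would ship it in /lib/systemd/system/).
unit_dir=$(mktemp -d)
cat > "$unit_dir/foo.service" <<'EOF'
[Unit]
Description=Hypothetical foo daemon

[Service]
ExecStart=/usr/sbin/foo

[Install]
WantedBy=multi-user.target
EOF

# Show the result; the whole service definition is three short sections.
cat "$unit_dir/foo.service"
```

Compare that with a typical init script, which has to implement start, stop, restart and status handling itself.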

Michael Stapelberg: Uploading packages via SFTP

Yesterday I uploaded a big package and got multiple timeouts. I then figured out that DDs can also upload using SFTP (i.e. SSH's file transfer thingie) instead of traditional FTP, which seems like a more modern alternative. So let's give that a try. With dput-ng, the following configuration leads to using sftp by default:
mkdir -p ~/.dput.d/profiles/
cat > ~/.dput.d/profiles/ftp-master.json <<EOT
{
    "fqdn": "ssh.upload.debian.org",
    "incoming": "/srv/upload.debian.org/UploadQueue/",
    "method": "sftp"
}
EOT
Note that uploading via SFTP will lead to debianqueued uploading the files via FTP for you. But maybe that is more reliable than doing it yourself. We'll see :-).

27 May 2013

Michael Stapelberg: Results of the Debian systemd survey

A week ago, we started the Debian systemd survey. The goal was to figure out a few trends and answer the following two questions:
  1. Do our subjective impressions from the discussions on debian-devel reflect the general sentiment about systemd?
  2. What are the main concerns that most people have?
Thank you all for your participation!

General

I am happy to tell that 2113 people had a look at the survey and 573 people actually participated. Of those, 45.7% said they are a DD, DM or otherwise maintaining packages. 74.5% said they actually booted and used a computer running systemd.

With regards to having systemd in Debian at all, not necessarily as the default init system, 358 participants welcome it, 88 do not want it, 81 are not sure yet and 46 do not care.

43.9% said they personally want systemd as the default init system in Debian, while 32.2% don't want that. The remaining 23.7% don't know yet.

Top concerns

About 50% of participants provided one or more concerns. Evaluating this part of the survey is obviously the hardest, since the answers were provided as free text. I tried categorizing the feedback into a few buckets. Each bucket comes with a number which is a weighted (!) count, i.e. the top concern counts 3 times, the second concern counts 2 times and the last concern just once.

If you cannot find your personal concerns, that is because those which were voiced by fewer than five people are not listed here. As I said, we cared about trends in this survey, not individual opinions.
  1. systemd is too complex, or bloated, or it does too many things, or has too many dependencies (weight: 217)
  2. systemd is not portable to non-linux systems, e.g. Debian/kFreeBSD or HURD (weight: 199)
  3. Debugging the boot process is harder than in sysvinit (weight: 106)
  4. I have a problem with systemd upstream and/or Lennart in particular (weight: 87)
  5. systemd violates the UNIX philosophy (weight: 35)
  6. I dislike binary logs and/or the journal in general (weight: 30)
  7. systemd is too new, still untested, unstable and experimental (weight: 24)
  8. I cannot find enough documentation in general or about the transition from sysvinit to systemd in particular (weight: 20)
  9. People need to learn new commands/how the new system works (weight: 20)
  10. I don't know what problems systemd solves and/or have no problems with sysvinit currently (weight: 19)
  11. I think the configuration is binary and/or the configuration format is weird (weight: 13)
  12. systemd's code is not good (memory leaks, performance problems) and/or upstream's development lacks best practices such as long-term stable branches (weight: 9)
  13. There are problems with systemd in Debian (e.g. Debian-specific extensions to /etc/fstab not supported, system doesn't boot) (weight: 9)
  14. Customizing the boot is harder than with sysvinit (weight: 8)
  15. systemd is not compatible with sysvinit (weight: 8)
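
The weighting scheme (first concern counts 3 times, second 2 times, third once) can be sketched as follows. This is not the script used to evaluate the survey; the input format (one "rank bucket" pair per line), the function name weighted_tally and the sample answers are assumptions made for illustration.

```shell
#!/bin/sh
# Weighted tally: rank 1 counts 3 times, rank 2 twice, rank 3 once.
# Input on stdin: one "rank bucket" pair per line (hypothetical format).
weighted_tally() {
    awk '
        $1 >= 1 && $1 <= 3 { weight[$2] += 4 - $1 }
        END { for (b in weight) printf "%d %s\n", weight[b], b }
    ' | sort -k1,1nr -k2,2
}

# Hypothetical answers from three participants.
printf '%s\n' \
    '1 complexity' '2 portability' '3 debugging' \
    '1 portability' '2 complexity' \
    '1 complexity' | weighted_tally
# prints:
# 8 complexity
# 5 portability
# 1 debugging
```

The weights in the list above were produced with the same idea, after manually sorting the free-text answers into buckets.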
I realize that the choice of buckets is somewhat broad and there are many different, finely nuanced opinions out there. Again, this is for seeing a trend, not accounting for every single individual opinion. In case you strongly believe that I did something wrong when evaluating the survey results, please contact me and we can work something out.

Conclusions/Actions

I know this is a controversial topic. Please don't start yet another systemd discussion on debian-devel. We, the systemd Debian maintainers, will try to come up with good answers to all listed concerns and communicate them in a friendly and concise way soon. Furthermore, we have recognized the need for more documentation/information about how things are supposed to work with regards to the transition (and systemd in general) and will address that, too.

Raw questions/numbers
Total records in survey: 573
Are you a Debian Developer (DD) or Debian Maintainer (DM) or otherwise currently maintaining Debian packages?
262 Yes
311 No
Did you ever boot and use a computer running systemd? It does not matter whether that computer was running Debian or a different operating system.
427 Yes
146 No
What is your general sentiment towards having systemd in Debian (not necessarily as default)?
358 I welcome systemd in Debian, everything is fine.
 81 I am not sure yet.
 46 I don't care.
 88 I don't want systemd in Debian.
Would you personally want systemd as the default init system in Debian?
252 Yes
136 I don't know
185 No

25 May 2013

Michael Stapelberg: Cloning git-buildpackage repositories

Whenever I want to work on some package, I usually clone its git repository, make my changes, then push and upload the Debian package. I don't keep those repositories around in order to avoid cruft and also to have a 100% clean, up-to-date setup whenever I start working on something.

Every time I clone such a repository, I struggle with the setup. For example, I usually forget the --pristine-tar flag for gbp-clone. Also, I usually forget to push other branches (working on "debian", forgetting "upstream") and, even more often, I forget to push tags.

I spent some time on this and figured out that one can use the following to make --pristine-tar the default:
cat >> ~/.gbp.conf <<EOF
[DEFAULT]
pristine-tar = True
EOF
Avoiding my other pain points is apparently not so easy, so I wrote a little shell function (only tested with zsh!) which uses debcheckout to get the git URL, gbp-clone to actually clone it and a few git config calls so that I can just run "git push" when I am done and not worry about anything:
# Clones the git sources of a Debian package
# needs debcheckout from devscripts and gbp-clone from git-buildpackage
function d-clone() {
    local package=$1
    if debcheckout --print "$package" >/dev/null
    then
        # debcheckout prints the repository type and URL; split them into $1 and $2
        set -- $(debcheckout --print "$package")
        if [ "$1" != "git" ]
        then
            echo "$package does not use git, but $1 instead."
            return
        fi
        echo "cloning $2"
        gbp-clone "$2" || return
        # Change to the newest git repository
        cd $(dirname $(ls -1td */.git | head -1)) || return
        # This tells git to push all branches at once,
        # i.e. if you changed upstream and debian (after git-import-orig),
        # both upstream and debian will be pushed when running "git push".
        git config push.default matching || return
        # This tells git to push tags automatically,
        # so you don't have to use "git push; git push --tags".
        git config --add remote.origin.push "+refs/heads/*:refs/remotes/origin/*" || return
        git config --add remote.origin.push "+refs/tags/*:refs/tags/*" || return
        echo "d-clone set up everything successfully."
    else
        echo "debcheckout $package failed. Is $package missing Vcs tags?"
    fi
}
With that function, starting work on a package becomes as easy as "d-clone golang-doc".
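
The function relies on "debcheckout --print" emitting the repository type followed by the URL, which "set --" then splits into $1 and $2. That splitting step can be tried in isolation with a canned line. The sketch below is written for POSIX sh or bash (zsh does not word-split unquoted parameters by default), and both the helper name parse_vcs_line and the example URL are made up.

```shell
#!/bin/sh
# Splits one line of "type<whitespace>url" into its two fields,
# mirroring the "set --" trick used in d-clone.
parse_vcs_line() {
    set -- $1    # unquoted on purpose: rely on word splitting
    echo "type=$1 url=$2"
}

# Hypothetical sample line; real input would come from debcheckout --print <package>.
sample=$(printf 'git\tgit://example.org/pkg-foo/foo.git')
parse_vcs_line "$sample"
# prints: type=git url=git://example.org/pkg-foo/foo.git
```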

20 May 2013

Michael Stapelberg: Debian systemd survey

In the past, we have had multiple heated discussions involving systemd. We (the pkg-systemd-maintainers team) would like to better understand why some people dislike systemd. Therefore, we have created a survey, which you can find at http://survey.zekjur.net/index.php/391182

Please only submit your feedback to the survey and not to this thread; we are not particularly interested in yet another systemd discussion at this point. The deadline for participating in the survey is 7 days from now, that is 2013-05-26 23:59:00 UTC.

Please participate only if you consider yourself an active member of the Debian community (for example participating in the debian-devel mailing list, maintaining packages, etc.). Of course, we will publish the results after the survey ends.

Thanks!

Best regards,
the Debian systemd maintainers

12 May 2013

Ian Campbell: qcontrol 0.5.1

I've just released qcontrol 0.5.1. Changes since the last release: I also put together a very basic homepage. Get it from gitorious or http://www.hellion.org.uk/qcontrol/releases/0.5.1/. The Debian package will be uploaded shortly.

20 April 2013

Ulrich Dangel: Analyzing rc bug messages

Michael Stapelberg recently posted a blog post about looking into the number of Debian Developers actively working on RC bugs for the upcoming wheezy release. In this blog post I analyze the data shared by Michael and provide the R commands used to generate the plots and findings. If you are interested in looking into the data yourself, but don't like R, I suggest using ipython notebook + numpy instead.

Analysis

After parsing the data file we typically want to get an understanding of the data. By using summary(bugs) we get the minimum (1), median (5), mean (15.4), maximum (716) and quartiles of the data. This shows that the number of messages is widely spread and that a few people contribute a lot. To visualize the dispersion of the data we can create a box plot showing the range of messages:

[Figure: boxplot]

As the first and third quartiles (2 and 12) are close together, we can assume that the majority of the work is done by a few, especially since the median is 5. This is supported by the histogram below, where the x axis is the number of recorded messages and y is the number of developers.

[Figure: histogram]

Top 10 contributors

The top 10 contributors, according to the dataset, are:
  1. Lucas Nussbaum - 716 messages
  2. Gregor Herrmann - 270 messages
  3. Jakub Wilk - 270 messages
  4. Andreas Beckmann - 225 messages
  5. Julien Cristau - 205 messages
  6. Cyril Brulebois - 169 messages
  7. Moritz Muehlenhoff - 162 messages
  8. Michael Biebl - 159 messages
  9. Salvatore Bonaccorso - 158 messages
  10. Christoph Egger - 142 messages

r commands These are the commands used to generate the plots and information in this plot:
bugs <- read.csv("by-msg.csv")
summary(bugs)
boxplot(bugs$rcbugmsg, log='y', range=0, ylab="# bugs")
quantile(bugs$rcbugmsg)
0%  25%  50%  75% 100%
1    2    5   12  716
# create histogram
llibrary('ggplot2')
ggplot(bugs, aes(x=rcbugmsg)) + geom_histogram(binwidth=.5, colour="black", fill="black") + scale_x_sqrt()
top10 <- tail(bugs[order(bugs$rcbugmsg),], 10)
top10
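For readers who would rather not use R, the summary statistics can be sketched in Python as well. The following is a minimal sketch using only the standard library (not numpy, so that it runs without extra packages). The column name rcbugmsg is taken from the R commands above; the "name" column and the placeholder names in it are assumptions, since the layout of by-msg.csv is not shown here. The sample contains only the ten counts from the top 10 list, so its statistics differ from those of the full dataset.

```python
import csv
import io
import statistics


def read_counts(csv_file, column="rcbugmsg"):
    """Read one integer column from a CSV file object with a header row."""
    return [int(row[column]) for row in csv.DictReader(csv_file)]


def summarize(counts):
    """Return min, quartiles, mean and max, similar to R's summary()."""
    ordered = sorted(counts)
    # method="inclusive" uses linear interpolation between data points,
    # which should match the default behaviour of R's quantile().
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.mean(ordered),
        "max": ordered[-1],
    }


# Sample input: only the top 10 counts from the post. The "name" column
# and its values are hypothetical placeholders.
SAMPLE = io.StringIO(
    "name,rcbugmsg\n"
    "a,716\nb,270\nc,270\nd,225\ne,205\n"
    "f,169\ng,162\nh,159\ni,158\nj,142\n"
)

counts = read_counts(SAMPLE)
stats = summarize(counts)
print(stats)
```

To run this against the real file, replace SAMPLE with open("by-msg.csv"), provided the file has a header row containing rcbugmsg.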

30 March 2013

Michael Stapelberg: Analyzing RC bug messages

Recently, I was wondering how many Debian Developers are actively working on RC bugs in some way or another in the time period from the last release (squeeze) to now (shortly before wheezy). I therefore grabbed the mailing list archives of debian-bugs-dist@ from gmane, used only those messages whose X-Debian-PR-Message header matches an RC bug (list retrieved from UDD) and then attributed the message counts to the appropriate Debian Developer. I am sure that there are subtle mistakes in the data I retrieved and that there most likely is a better way to achieve the same results, but this is only meant to show a trend and should be good enough for that. So, it turns out that 514 different Debian Developers have sent messages regarding RC bugs since squeeze. That's about half of our 985 total active Debian Developers. I would love to show a good histogram of the actual message counts (not sure how well received showing the raw data is), but I suck at visualizing such data in a compact way. Is there anyone familiar with R (or other free data visualization tools) and willing to help? I can send you the CSV file, just send me an email to stapelberg@.
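The counting step described above can be sketched roughly as follows. This is not the script that produced the data, only an illustration under stated assumptions: the X-Debian-PR-Message header is assumed to end in the bug number (for example "followup 694971"), messages are attributed to the lower-cased From address, and the function names and sample messages are hypothetical. Mapping addresses to Debian Developers and retrieving the RC bug list from UDD are left out.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def bug_number(msg):
    """Extract the bug number from X-Debian-PR-Message, or None.

    Assumption: the header value ends in the bug number,
    e.g. "followup 694971".
    """
    tokens = msg.get("X-Debian-PR-Message", "").split()
    if tokens and tokens[-1].isdigit():
        return int(tokens[-1])
    return None


def count_rc_messages(raw_messages, rc_bugs):
    """Count messages per sender address, keeping only RC bug messages."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        if bug_number(msg) in rc_bugs:
            _, address = parseaddr(msg.get("From", ""))
            counts[address.lower()] += 1
    return counts


# Hypothetical sample messages; real input would come from the archives.
MESSAGES = [
    "From: Jane Doe <jane@example.org>\n"
    "X-Debian-PR-Message: followup 694971\n\nTested the patch.\n",
    "From: Jane Doe <Jane@example.org>\n"
    "X-Debian-PR-Message: followup 697172\n\nCannot reproduce.\n",
    "From: John Roe <john@example.org>\n"
    "X-Debian-PR-Message: report 100000\n\nNot an RC bug.\n",
    "From: John Roe <john@example.org>\n\nNo BTS header at all.\n",
]

rc_counts = count_rc_messages(MESSAGES, rc_bugs={694971, 697172})
print(rc_counts)
```

The number of distinct developers is then simply len(rc_counts), once addresses have been merged per person.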

8 March 2013

Michael Stapelberg: Replying to Debian BTS messages in notmuch

Previously, my workflow for replying to bug reports outside my own packages was very uncomfortable: I first downloaded the mbox archive from the BTS, then imported it into claws-mail, hit reply all, removed submit@, added bugnumber@, and then sent the email. Therefore, I decided to hack up a little elisp function to automate this process for notmuch. It first downloads the message from the BTS, adds it to the notmuch database, then calls notmuch-mua-reply on the message and fixes the To: header:
;; Removes submit@bugs.debian.org from the recipients of a reply-all message.
(defun debian-remove-submit (recipients)
  (delq nil
	(mapcar (lambda (recipient)
		  (and (not (string-equal (nth 1 recipient) "submit@bugs.debian.org"))
		       recipient))
		recipients)))
(defun debian-add-bugrecipient (recipients bugnumber)
  (let* ((bugaddress (concat bugnumber "@bugs.debian.org"))
	 (addresses (mapcar (lambda (x) (nth 1 x)) recipients))
	 (exists (member bugaddress addresses)))
    (if exists
	recipients
      (append (list (list (concat "Bug " bugnumber) bugaddress)) recipients))))
;; TODO: msg should be made optional and it should default to the latest message in the bugreport.
;; NB: bugnumber and msg are both strings.
(defun debian-bts-reply (bugnumber msg)
  ;; Download the message to ~/mail-copy-fs/imported.
  (let ((msgpath (format "~/mail-copy-fs/imported/bts_%s_msg_%s.msg" bugnumber msg)))
    (let* ((url (format "http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=%s;mbox=yes;bug=%s" msg bugnumber))
	   (download-buffer (url-retrieve-synchronously url)))
      (save-excursion
	(set-buffer download-buffer)
	(goto-char (point-min)) ; just to be safe
	(if (not (string-equal
		  (buffer-substring (point) (line-end-position))
		  "HTTP/1.1 200 OK"))
	    (error "Could not download the message from the Debian BTS"))
	;; Delete the HTTP headers and the first "From" line (in order to
	;; make this a message, not an mbox).
	(re-search-forward "^$" nil 'move)
	(forward-char)
	(forward-line 1)
	(delete-region (point-min) (point))
	;; Store the message on disk.
	(write-file msgpath)
	(kill-buffer)))
    ;; Import the mail into the notmuch database.
    (let ((msgid (with-temp-buffer
		   (call-process "~/.local/bin/notmuch-import.py" nil t nil (expand-file-name msgpath))
		   (buffer-string))))
      (notmuch-mua-reply (concat "id:" msgid) "Michael Stapelberg <stapelberg@debian.org>" t)
      ;; Remove submit@bugs.debian.org, add <bugnumber>@bugs.debian.org.
      (let* ((to (message-fetch-field "To"))
	     (recipients (mail-extract-address-components to t))
	     (recipients (debian-remove-submit recipients))
	     (recipients (debian-add-bugrecipient recipients bugnumber))
	     (recipients-str (mapconcat (lambda (x) (concat (nth 0 x) " <" (nth 1 x) ">")) recipients ", ")))
	(save-excursion
	  (message-goto-to)
	  (message-delete-line)
	  (insert "To: " recipients-str "\n")))
      ;; Our modifications don't count as modifications.
      (set-buffer-modified-p nil))))
In case you want to get updates, you can find the latest version of this code in my configfiles git repository. To add a single message to the notmuch database and get its message ID, I have written this simple python script (using python-notmuch), located in ~/.local/bin/notmuch-import.py:
#!/usr/bin/env python
# vim:ts=4:sw=4:et
import notmuch
import sys
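if len(sys.argv) < 2:
    print("Syntax: notmuch-import.py <filename>")
    sys.exit(1)
# Open the default notmuch database read-write so the message can be added.
db = notmuch.Database(mode=notmuch.Database.MODE.READ_WRITE)
(msg, status) = db.add_message(sys.argv[1])
# Print the message ID so that the calling elisp function can use it.
print(msg.get_message_id())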
If you have any improvements, I'd love to hear about them. If it's useful for you, enjoy.
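For readers who do not use Emacs, the To: header fix-up performed by debian-remove-submit and debian-add-bugrecipient can be restated in Python. This is only a sketch of the same idea, not part of the setup described above; the function name fix_recipients and the example addresses are made up for illustration.

```python
from email.utils import formataddr, getaddresses

SUBMIT = "submit@bugs.debian.org"


def fix_recipients(to_header, bugnumber):
    """Drop submit@ and make sure <bugnumber>@bugs.debian.org is present.

    bugnumber is a string, as in the elisp version.
    """
    bug_address = bugnumber + "@bugs.debian.org"
    # Parse the header into (display name, address) pairs, minus submit@.
    recipients = [
        (name, addr)
        for name, addr in getaddresses([to_header])
        if addr != SUBMIT
    ]
    # Prepend the bug address unless it is already a recipient.
    if bug_address not in [addr for _, addr in recipients]:
        recipients.insert(0, ("Bug " + bugnumber, bug_address))
    return ", ".join(formataddr(pair) for pair in recipients)


# Hypothetical example header.
fixed = fix_recipients(
    "submit@bugs.debian.org, Jane Doe <jane@example.org>", "694971"
)
print(fixed)
```

As in the elisp code, the bug address is only added when it is missing, so running the fix-up twice does not duplicate it.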

6 February 2013

Michael Stapelberg: RC bugs

I recently worked on the following RC bugs:

By the way, in case anyone needs to reproduce an armhf bug and wants to do so in a chroot with qemu, here are the steps I used:
sudo qemu-debootstrap --arch=armhf --foreign wheezy armhf http://ftp.de.debian.org/debian
sudo LC_ALL=C chroot armhf /bin/bash
echo 'deb http://ftp.de.debian.org/debian wheezy main' > /etc/apt/sources.list
echo 'nameserver 8.8.8.8' > /etc/resolv.conf
apt-get update
apt-get install clang
Also, here are UDD queries which I prefer to those posted by RichiH. Note that they don't display all bugs, but ignore those which were created in the last 7 days.

5 February 2013

Michael Stapelberg: RC bugs

I recently worked on the following RC bugs:

It is getting hard to find RC bugs to work on. If you have access to ia64 hardware, have a look at #694971 ("Epiphany browser crashes within JSC::JSArray::increaseVectorLength()") or #697172 ("sporadic crashes of epiphany browser due to a thread-unsafe favicon database"). Both have patches available that just need to be tested.

2 January 2013

Michael Stapelberg: WIT in Debian

I just uploaded wit-2.10a to Debian experimental (it has to pass the NEW queue first, though). WIT (Wiimms ISO Tools) is a set of command-line tools to manipulate Wii and GameCube ISO images and WBFS containers. It is useful (for me) to store backups of my Wii games on a USB hard disk drive. This saves me the optical disc juggling, doesn't wear out the discs as fast and gives faster load times. Here is an example session where I format one partition of the USB hard disk with WBFS and then copy my old WBFS image over to it:
$ wwt format -v --force /dev/sde3
wwt: Wiimms WBFS Tool v2.10a r0 x86_64 - Dirk Clemens - 2013-01-02
FORMAT BLOCK DEVICE /dev/sde3 [172 GiB, hss=512]
** 1 file formatted.
$ wwt add --part /dev/sde3 /media/sde1/wbfs/The\ Legend\ of\ Zelda\ Skyward\ Sword\ \[SOUP01\]/*.wbfs
*****  wwt: Wiimms WBFS Tool v2.10a r0 x86_64 - Dirk Clemens - 2013-01-02  *****
WBFSv1 #1/1 opened: /dev/sde3
 - ADD 1/1 [SOUP01] WBFS:/media/sde1/wbfs/The Legend of Zelda Skyward Sword [SOUP01]/SOUP01.wbfs/#0
* WBFS #1: 1 disc added.
wwt add --part /dev/sde3   0,02s user 7,66s system 2% cpu 5:02,68 total
$ wwt list
ID6     1/500 discs (4 GiB)
---------------------------------------------------------------------
SOUP01  The Legend of Zelda Skyward Sword
---------------------------------------------------------------------
Total: 1/500 discs, 4176 MiB ~ 4 GiB used, 171444 MiB ~ 167 GiB free.

5 December 2012

Michael Stapelberg: RC bugs

I recently worked on the following RC bugs:

4 December 2012

Michael Stapelberg: RC bugs

I recently worked on the following RC bugs:
