Search Results: "Marc Brockschmidt"

13 March 2011

Lars Wirzenius: DPL elections: candidate counts

Out of curiosity, and because it is Sunday morning and I have a cold and can't get my brain to do anything tricky, I counted the number of candidates in each year's DPL elections.

Year	Count	Names
1999	4	Joseph Carter, Ben Collins, Wichert Akkerman, Richard Braakman
2000	4	Ben Collins, Wichert Akkerman, Joel Klecker, Matthew Vernon
2001	4	Branden Robinson, Anand Kumria, Ben Collins, Bdale Garbee
2002	3	Branden Robinson, Raphaël Hertzog, Bdale Garbee
2003	4	Moshe Zadka, Bdale Garbee, Branden Robinson, Martin Michlmayr
2004	3	Martin Michlmayr, Gergely Nagy, Branden Robinson
2005	6	Matthew Garrett, Andreas Schuldei, Angus Lees, Anthony Towns, Jonathan Walther, Branden Robinson
2006	7	Jeroen van Wolffelaar, Ari Pollak, Steve McIntyre, Anthony Towns, Andreas Schuldei, Jonathan (Ted) Walther, Bill Allombert
2007	8	Wouter Verhelst, Aigars Mahinovs, Gustavo Franco, Sam Hocevar, Steve McIntyre, Raphaël Hertzog, Anthony Towns, Simon Richter
2008	3	Marc Brockschmidt, Raphaël Hertzog, Steve McIntyre
2009	2	Stefano Zacchiroli, Steve McIntyre
2010	4	Stefano Zacchiroli, Wouter Verhelst, Charles Plessy, Margarita Manterola
2011	1	Stefano Zacchiroli (no vote yet)

Winners are indicated by boldface. I expect Zack to win over "None Of The Above", so I went ahead and boldfaced him already, even though there has not been a vote for this year. The median number of candidates is 4.
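The median is easy to recheck from the table. A minimal Python sketch, with the counts copied by hand from the Count column (the variable names are illustrative only):

```python
from statistics import median

# Number of DPL candidates per year, copied from the table above.
candidates_per_year = {
    1999: 4, 2000: 4, 2001: 4, 2002: 3, 2003: 4, 2004: 3, 2005: 6,
    2006: 7, 2007: 8, 2008: 3, 2009: 2, 2010: 4, 2011: 1,
}

counts = sorted(candidates_per_year.values())
median_count = median(counts)

print(counts)        # the 13 yearly counts in ascending order
print(median_count)  # 4
```

With 13 years in the table, the median is simply the seventh value of the sorted list.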

10 May 2010

Stefano Zacchiroli: UDD - consolidating bazaar metadata for QA and data mining

Eclectic paper on the Ultimate Debian Database

A few months ago I co-authored a paper on UDD with Lucas. It has just been presented at this year's IEEE Mining Software Repositories conference, continuing my recent tradition of eclectic papers. The paper is titled The Ultimate Debian Database: Consolidating Bazaar Metadata for Quality Assurance and Data Mining and is available for download from my publications page.

For Debian people already familiar with UDD there is probably not much to learn from it, as the main target of the paper is the community of scientists doing data mining on software repositories. For them, UDD offers a valuable entry point to Debian "facts", as the data sources reflected in the database are easily joinable and to some extent already validated by other UDD users (e.g. QA people).

Nevertheless, the first two sections of the paper are probably of broader interest. There we give our point of view on the so-called Debian Data Hell: why it exists, how it's related to the nature of Debian and similar distros, etc. I've already noted in the past how that is also related to the culture of freedom that in Debian we value not only in our software, but also in our infrastructure and procedures. We should just get rid of a bit of inertia, and total world domination will then be just around the corner :-)

I'm happy to conclude by quoting the acknowledgments section of the paper:

Acknowledgments

The authors would like to thank all UDD contributors, and in particular: Christian von Essen and Marc Brockschmidt (student and co-mentor in the Google Summer of Code which witnessed the first UDD implementation), Olivier Berger for his support and FLOSSmole contacts, Andreas Tille who contributed several gatherers, the Debian community at large, the "German cabal" and Debian System Administrators for their UDD hosting and support.

18 March 2010

Clint Adams: BSPs and Triage and squeeze

Some people want Debian 6.0 (codename "squeeze") to be released sometime this year. Personally I think there is some kitsch value in releasing it during DebCamp or DebConf, and there is something to be said for doing it just beforehand so that everyone can break unstable into insanity while rolling around on the quad under the hot August sun. Some people want it to happen sooner, possibly as soon as possible. Irrespective of your preferences, or if you care at all, we will repeat the time-honored mantra: it will be released when it's ready. Readiness is somewhat of a subjective measure, but we typically do not consider a potential release ready when the RC bug count is high. There are a lot of RC bugs right now. Typically these are fixed at some point by their maintainers. There are also some self-motivated individuals who enjoy fixing these bugs themselves, and this practice predates the RCBW initiative, though perhaps not to the same level of combined intensity and endurance. Then there are bug-squashing parties, or BSPs. A BSP can be held virtually which means that a bunch of people get together online and say that they are having a party, or it can be at a designated physical space. I attended one of the latter type not too long ago. This BSP was targeted specifically to newbies or people who might feel that they are unqualified to help squash bugs, but wished to learn how to contribute. Some group and one-on-one mentoring occurred, in areas of how packages are put together, how to patch things, how to submit patches to the BTS, how to reproduce bugs, and other more problem-specific topics. I really don't know how successful this endeavor was since getting metrics on the effectiveness of teaching has been a historically difficult problem. On the other hand, we did spend a fair amount of time actually squashing bugs, which is a bit easier to assess. 
One problem we had was that some people liked to be loud and distracting and demand unreasonable amounts of attention for topics that had nothing to do with bugs, Debian, or free software, but we made progress regardless. Marc Brockschmidt thinks we need more BSPs and more focus on bug triage. Bug triage can be very demotivating and many people dislike doing it. BSPs seem to have powers of motivation through group camaraderie or friendly competition, so I would agree that it's worth experimenting with.

3 March 2010

Andreas Barth: Changes in wanna-build

During recent weeks, not only sbuild and buildd were changed, but also wanna-build. Many changes were small and have no direct impact, but will make our lives easier in the future. This includes a bunch of code cleanups. Most changes were done by Kurt Roeckx and me, but as usual Marc Brockschmidt and Philipp Kern were also involved.

This round of changes started with redoing our priority calculation. Up to now, every package had a fixed place in the list, and our list was built from top to bottom as far as buildd power was available (putting aside manual intervention by admins setting build-priority). That meant of course that some packages could be stalled if buildd time isn't enough anymore, as is currently the case on the mipsen. (The queue order was determined by the following sort options: build-priority, (>= standard?), already built in the past, priority, section, name; the first difference decided the list order.)

Now, of course, >= standard packages are still built first, but waiting days increase priority so that old extra packages can be built before young optional packages (in other words, they shouldn't stall). The new formula is roughly:

required: 50, important: 40, standard: 30, optional: 5 [priority]
+ libs: 4, devel: 2 [section]
+ contrib: -20, non-free: -40 [component]
+ out-of-date: 20 [notes]
+ max(6, waitingdays) * 2
+ manual priorities

Packages are ordered by this number, then by waitingdays, then by name.

While adding the code that gives a bonus to long-waiting packages, we stumbled across the fact that there were non-C dates stored in the database, which in turn meant that the export of the database stopped working. To fix that, we replaced the last-change field in the database by a postgres now() on insert, and converted that field to a date field (instead of free text). That in turn broke mkstats and a few more things, which are fixed as of now. While doing that, we also introduced the format option, which allows queries like:

wanna-build --format='%t %u: %p/%v% +b B%B' -A mipsel --list=building

which gives output like:

2010-03-03 15:24:38.642988 buildd_mipsel-mayer: cracklib2/2.8.16-1
2010-03-03 15:30:00.341313 buildd_mipsel-rem: liblouisxml/2.1.0-1+b1

Of course, there are even better things one could do with that. :) More changes are pending: for example, the injector for log files was changed so that we record build times in the database. This will allow us to take build time into account on at least a few buildds, so that large packages can no longer stall all buildds completely so easily. So, more to come ...
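The priority formula from this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not wanna-build's actual code: the Package class and function names are hypothetical, priorities and sections not named in the formula (such as extra) are assumed to score 0, and longer-waiting packages are assumed to sort first when scores tie.

```python
from dataclasses import dataclass

# Scores taken from the formula in the post; anything not listed scores 0
# (an assumption, the post does not say).
PRIORITY_SCORE = {"required": 50, "important": 40, "standard": 30, "optional": 5}
SECTION_SCORE = {"libs": 4, "devel": 2}
COMPONENT_SCORE = {"contrib": -20, "non-free": -40}
OUT_OF_DATE_BONUS = 20


@dataclass
class Package:  # hypothetical record, not a wanna-build data structure
    name: str
    priority: str
    section: str
    component: str
    out_of_date: bool
    waiting_days: int
    manual_priority: int = 0


def score(pkg: Package) -> int:
    """Sum the terms of the formula exactly as written in the post."""
    return (
        PRIORITY_SCORE.get(pkg.priority, 0)
        + SECTION_SCORE.get(pkg.section, 0)
        + COMPONENT_SCORE.get(pkg.component, 0)
        + (OUT_OF_DATE_BONUS if pkg.out_of_date else 0)
        + max(6, pkg.waiting_days) * 2
        + pkg.manual_priority
    )


def build_order(packages):
    """Order by score, then by waiting days (assumed descending), then by name."""
    return sorted(packages, key=lambda p: (-score(p), -p.waiting_days, p.name))


old_extra = Package("old-extra", "extra", "misc", "main", False, waiting_days=30)
young_optional = Package("young-optional", "optional", "misc", "main", False, waiting_days=1)

for pkg in build_order([young_optional, old_extra]):
    print(pkg.name, score(pkg))
```

In this sketch the extra package that has waited 30 days is scheduled ahead of the optional package that has waited one day, which is the behaviour the post describes.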

11 February 2010

Alexander Reichle-Schmehl: [Updated] RC-Bug statistics for Squeeze, calendar week 6:

Back in the Lenny release cycle, I published some statistics on release critical bugs. As we are getting (or trying to get) nearer to a freeze of Squeeze, I think it's time to start that again. But first, a short explanation of what these numbers actually mean.

Our official bug tracking system currently lists 750 release critical bugs affecting the next stable release. While this number is kind of true, it is not accurate to use it to measure the state of the release. The unofficial RC bug thingy at bts.turmzimmer.net is considerably more powerful for getting interesting numbers, as you can ignore specific kinds of bugs and filter them properly. Note that the total number is shown at the very end of the detailed list of bugs.

With that interface you can, for example, filter by distribution. It currently lists 650 RC bugs in total for any distribution, but if you filter for squeeze bugs, you get only 518. That's a different number from the one the official bug tracker shows; I'm not entirely sure why, but part of the difference is that both the official and the unofficial web pages are only synced periodically. Should someone be able to give a good explanation of the differences, please step forward ;)

But let's play some more: you can filter for bugs present in squeeze only, but not in sid. These bugs are already fixed in sid, but the packages haven't yet migrated to squeeze. Currently there are 94 such bugs. We can furthermore ignore bugs present only in sid; when squeeze is unaffected, we just don't migrate the broken package from sid to squeeze. The number that is most interesting for contributors is the number of bugs affecting both sid and squeeze, as these are the ones which really need to be fixed, since no fix is known yet. Filtering for bugs in both, we are down to 424 at the moment.

But of these 424 remaining release critical bugs, we can still ignore some (for now). For example, bugs concerning packages in non-free or contrib (9 currently) won't stop us from releasing, will they? There are also many bugs marked as pending (32), meaning that the maintainer is aware of the bug, has a fix prepared and is just holding the upload back for a moment. Some of these bugs already have a patch (64), but no one has reviewed and uploaded it yet. You'll also see that you can ignore merged bug reports (47). Those are bug reports reporting the very same bug several times. Finally, there are bug fixes already uploaded to the so-called delayed queue (6). These are bug fixes which were uploaded by someone other than the real maintainer; to give the maintainer a chance to act himself or to comment, the upload is delayed by some days before it actually hits the archive. 17 bugs are claimed, meaning that someone has already said he will take care of the bug (but has not finished yet). You could also ignore the "bugs invalidated by today's britney" category, representing bugs which will vanish after the next migration of packages from sid to squeeze, but as we are only looking at RC bugs in both sid and squeeze, this number will always be 0 ;) That leaves the bugs "somehow otherwise marked as fixed" (39), where I honestly have no idea what they are... If someone can explain them, please do so ;)

Here is a small table showing the above numbers:

Total:	650
Affecting Squeeze:	518
Squeeze-only:	94
Unfixed bugs remaining in Squeeze:	424

Of these 424 bugs...

... are pending:	32
... are patched:	64
... are duplicates:	47
... are in Non-free or contrib:	9
... are claimed by someone:	17
... are fixed in the delayed queue:	6
... are somehow marked as fixed:	39

Or in other words:
Release critical bugs left in squeeze, when ignoring all these:
260

That's a pretty number, isn't it? There are only 260 bugs left which need our immediate attention :) But that is a rather optimistic view of the situation. We assume, for example, that all bugs with a patch are really fixed by the patch. We also ignored that some bugs in squeeze are already fixed in sid, but the package can't migrate because the package in sid contains a new bug. (Which happens to release wizard Marc Brockschmidt quite often.) So if we take off the purple glasses and take a more pessimistic view (release managers must be pessimists to ensure high quality packages ;) we count the bugs in squeeze and ignore only those which are:

hinted / delayed upload (hint means a package is forcefully migrated from sid to squeeze without waiting the usual time span)
merged
contrib
non-free
bugs invalidated by today's britney (fixed in sid, package will migrate soon)

With this release manager's view, 424 bugs remain to be fixed somehow before we can release. As far as I know, we can freeze once that number is reliably under 300. So we'd better get working ;)

When I asked what is most needed to help the release, I was told: MIPS porters! As you might have read, we have some problems with the mips and mipsel architectures. It seems we have all the hardware we need, but it doesn't work very reliably. I have been told that over 300 packages are only waiting to be built on the mips* architectures. So if you can help us in that regard, please contact our mips porter mailing list.

Update: Andreas Barth just told me that he has already sent a couple of mails to the mips list describing the problem and asking questions. Answers to these questions would be most welcome and gladly accepted.
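How the figures in this post relate to each other can be checked with a little arithmetic. The Python sketch below uses only the numbers quoted in the table; the variable names are illustrative. Note that subtracting every "ignorable" category from 424 gives 210, not the 260 reported. One possible explanation, which is an assumption and not something the post states, is that a single bug can fall into several categories (for example both patched and claimed), so the categories overlap.

```python
# Figures quoted in the post.
total_all_distributions = 650
affecting_squeeze = 518
squeeze_only = 94

# Bugs present in both sid and squeeze.
unfixed_in_both = affecting_squeeze - squeeze_only

# Categories the optimistic view ignores, as listed in the table.
ignorable = {
    "pending": 32,
    "patched": 64,
    "duplicates": 47,
    "non-free or contrib": 9,
    "claimed": 17,
    "delayed queue": 6,
    "somehow marked as fixed": 39,
}

naive_remaining = unfixed_in_both - sum(ignorable.values())
reported_remaining = 260

# If the categories overlap (assumption), this many category memberships
# would be double-counted by the naive subtraction.
apparent_overlap = reported_remaining - naive_remaining

print(unfixed_in_both)    # 424
print(naive_remaining)    # 210
print(apparent_overlap)   # 50
```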

10 April 2009

Obey Arthur Liu: Google Summer of Code 2009: Debian's Shortlist

Copy of http://lists.debian.org/debian-devel/2009/04/msg00421.html.

Hi folks,

We have been pretty busy these past few weeks with the whole Google Summer of Code 2009 student application process. I can say that we have a very good set of proposals this year and I'd like to thank all the students and mentors for this.

I am going to present to you our shortlist of projects that we would like to be funded and believe we can reasonably manage to get funded. As always, remember that the number of slots is not final yet at this point, so we can't promise anything. The first preliminary slot count given today was *10* (same as last year) and we hope to get *2* more (as we did last year). This shortlist is alphabetically ordered because we don't want to reveal the current internal rankings.

I am inviting you to debate what you think is cool, what is useful, what is important to Debian, and maybe give us pointers to resources or people that could be helpful for the projects. We will try to alter our current rankings to reflect the zeitgeist in Debian, while taking into account the personal information that we have about each student involved. The deadline for any modification is on the 15th, so get everything in by the 14th. The final selected projects will be announced by Google on April 20th, ~12 noon PDT / 19:00 UTC. We'll have another announcement then.

Three proposals need or may need a mentor; I have indicated which. For more information about the projects or mentoring and how to talk to us directly, scroll down past the list.

Debian's Shortlist:

- Aptitude Package Management History Tracking
- Automatic Debug Packages Creation and Handling
- Debbugs Web UI: Amancay Strikes Back
- Control Files Parsing/Editing Library/Qt4-Debconf Qt4-Perl bindings
- Debian-Installer Support for GNU/kFreeBSD
- KDE/Qt4 Adept 3.0 Package Manager
- Large Scientific Dataset Package Management
- MIPS N32 ABI Port
- MTD Embedded Onboard flash Partitioning and Installation
- On-demand Cloud Computing with Amazon EC2 and Eucalyptus Integration
- Port back update-manager to Debian and all Derivatives
- Debian Autobuilding Infrastructure Rewrite

And the details:

Aptitude Package Management History Tracking
Student: Cristian Mauricio Porras Duarte, Mentor: Daniel Burrows
Aptitude currently does not track actions that the user has performed beyond a single session of the program. One of the most frequent requests from users is to find out when they made a change to a package, or why a package was changed; we want to store this information and expose it in the UI in convenient locations. As a side effect, this might also provide some ability to revert past changes.

Automatic Debug Packages Creation and Handling
Student: Emilio Pozuelo Monfort, Mentor: Marc Brockschmidt
This proposal aims at providing debug binary packages for the packages in the Debian archive in an automatic manner, moving them away from the official Debian archive to a special one. This has the benefit of providing thousands of debug packages for all the architectures, without any work needed from the developers and without bloating the archive.

Debbugs Web UI: Amancay Strikes Back
Student: Diego Escalante Urrelo, Mentor: Margarita Manterola
The Amancay project aims to be a new read/write web frontend to Debian's BTS, allowing DDs and contributors to easily interact with bugs via an intuitive yet powerful interface, enabling new workflows and creating new contribution opportunities like triaging while upholding reporting quality.

Control Files Parsing/Editing Library/Qt4-Debconf Qt4-Perl bindings
Student: Jonathan Yu, Mentor: (probably) Dominique Dumont, see below
This project proposes a common library for parsing and manipulating Debian Control files, including control, copyright and changelog. Main ideas include validating and parsing of these files, with both Strict and Quirks modes for the parser. The second idea is a new frontend for Debconf using Qt4 (for which Perl bindings will be written).

Debian-Installer Support for GNU/kFreeBSD
Student: Luca Favatella, Mentor: Aurelien Jarno
GNU/kFreeBSD is currently using a hacked version of the FreeBSD installer combined with crosshurd as its own installer. While this works more or less correctly for standard installations (read: the exact same installation as in the documentation), it does not allow any changes in the installation process except the hard disk partitioning. This project is about porting debian-installer to GNU/kFreeBSD and, to a bigger extent, making debian-installer less Linux-dependent.

KDE/Qt4 Adept 3.0 Package Manager
Student: Mateusz Marek, Mentor: NEEDS MENTOR, see below
Finish Adept 3.0, a fully integrated package manager for Qt4/KDE4. Adept is currently the only viable path to a Debian native package manager on KDE that would support modern features such as tags, indexed search or good conflict resolving. With Aptitude-gtk still in development and only available for GTK+, and (K)PackageKit having fundamental problems, Debian needs this project to stay in control of its package management on KDE after much neglect in recent years.

Large Scientific Dataset Package Management
Student: Roy Flemming Hvaara, Mentor: Charles Plessy
Large public datasets, like databases for bioinformatics, are typically too big and too volatile to fit the traditional source/binary packaging scheme of Debian. There are some programs distributed in Debian, like blast and emboss, that can index specialised databases, but Debian lacks a tool to install or update the datasets they need and keep their indexing in sync.

MIPS N32 ABI Port
Student: Sha Liu, Mentor: Anthony Fok
This project first focuses on creating a new MIPS N32 ABI port for Debian. Different from O32 and N64, N32 is an address model which has most 64-bit capabilities but uses 32-bit data structures to save space and processing time. A second focus will be on making such a mipsn32el arch fully optimized for the Loongson 2F CPU, which is gaining more and more popularity in subnotebooks/netbooks in many countries.

MTD Embedded Onboard flash Partitioning and Installation
Student: Per Andersson, Mentor: Wookey
Many embedded devices have MTD onboard flash as persistent storage, like the Kurobox Pro NAS, the Neo Freerunner, the Sheeva Plug or the OLPC. With MTD flash being so popular and with increases in capacity, support for MTD partitioning/installation would make Debian even more interesting to a wide range of devices, making it one step closer to being universal.

On-demand Cloud Computing with Amazon EC2 and Eucalyptus Integration
Student: David Wendt Jr, Mentor: (probably) Steffen Moeller, see below
In many academic fields, as well as commercial industries, people use clusters to distribute tasks among multiple machines. Many times this is done by packaging a whole operating system disk image, uploading it onto the cluster, and having the cluster run it in a VM. This project intends to make it easier for Debian to distribute prepared disk image templates like it distributes CD images now, for users to recreate or customise these templates with Debian packages, and for administrators to host such clusters with Debian.

Port back update-manager to Debian and all Derivatives
Student: Stephan Peijnik, Mentor: Michael Vogt
The project would involve taking the distribution-(Ubuntu-)specific update-manager code, analyzing it, and creating a package with just its core functionality, decoupling the distribution-specific parts and thus making the core code extensible by distribution-specific add-ons. This in turn would remove the need to port update-manager to Debian with every upstream release. An additional optional goal would be replacing the synaptic backend with a python-apt based one.

Debian Autobuilding Infrastructure Rewrite
Student: Philipp Kern, Mentor: Luk Claes
Rewrite the software that currently runs the Debian autobuilding infrastructure in a way that makes it more maintainable and robust. It will use Python as its programming language and PostgreSQL for the database backend. By harmonizing buildds, many build failures can be prevented and wasteful workload on buildd volunteers can be reduced.

On mentoring:

KDE/Qt4 Adept 3.0 Package Manager:

Petr Rockai, the original developer of Adept, has offered help to anyone willing to adopt Adept. Sune Vuorela has offered help for any Qt4 and KDE related issues. *We really need a mentor here*. The student is quite competent but Google dictates that we provide a mentor to handle student management.

Control Files Parsing/Editing Library/Qt4-Debconf Qt4-Perl bindings:

Dominique Dumont, although not a DD, has signaled interest in mentoring this, although it hasn't been confirmed yet. Sune Vuorela has offered to help co-mentor for the Qt4-Debconf and Qt4-Perl bindings part.

On-demand Cloud Computing with Amazon EC2 and Eucalyptus Integration:

Steffen Moeller has signaled interest in mentoring this, although it hasn't been formally confirmed yet. Charles Plessy of the Debian Med team will provide help for use-case related issues. Eric Hammond, developer of the original vmbuilder image creation tool and maintainer of a set of Debian and Ubuntu images, will provide help for Amazon EC2 and image creation issues. Chris Grzegorczyk from the Eucalyptus team will provide help for Eucalyptus and Eucalyptus/Debian integration issues.

Contacting us:

Considering the tight schedule, most stuff happens live on IRC: #debian-soc on irc.debian.org

You can also consult our wiki page for some additional information:
<http://wiki.debian.org/SummerOfCode2009>

We have a mailing-list at:
<http://lists.alioth.debian.org/mailman/listinfo/soc-coordination>

Keep this discussion on debian-devel@lists.debian.org while cc-ing soc-coordination@lists.alioth.debian.org. This thread is for debian-devel primarily.

2 February 2009

Obey Arthur Liu: Debian Summer of Code 08 : Where are they now (part 3/3)

Welcome back for the last part of the reviews. You may want to look at the previous parts: part 1 and part 2.

Jigdo-ivory, a JavaScript Jigdo client

Presentation

Debian CDs and DVDs take up a huge amount of space on download servers. Using jigdo to download those images can significantly reduce the amount of bandwidth and space needed on the central servers. Unfortunately, jigdo currently needs special client software to be downloaded/installed first. Adding support directly into a browser-based application could potentially make a very big difference for first-time users here.

Jigdo was created in 2001. It made it possible to create ISOs from .debs grabbed from regular mirrors. It eliminated the need to duplicate the entire contents of the package repository into ISO files for each release or, even more importantly, for weekly snapshots of testing/unstable/whatever. You may find the complete proposal from the student here. The idea originated with the Debian-CD people, who wanted to explore ideas about creating a light web client. The project was mentored by Steve McIntyre, who developed a new version of the Jigdo tools, jigit, which is much more efficient.

Student

Dustin Rayner was a 5th year senior undergraduate student at the Oklahoma Christian University in Oklahoma City, Oklahoma. In his own words: "I studied Computer Engineering for 3 years before deciding to pursue a Mathematics and Computer Science degree."

Result

This project was unsuccessful due to numerous issues. The first was inadequate technical preparation of the original proposal: the Debian-CD people were too optimistic about the possibilities of JavaScript. In the end, the copying and checksumming parts of the Jigdo process were implemented, but the checksumming (with a JavaScript implementation of md5) was so slow that it was unusable (think 50kb/s on a regular laptop at full CPU load). The student did the right thing in investigating Java and ActiveX, but unfortunately it was too late and he ultimately lacked the experience and knowledge in the relevant technologies. If the proposal is tried again, the student would be required to have much more experience with Java (and possibly ActiveX). Those would be much more efficient for the task, as they are the most used technologies among on-line anti-virus scanners, which have a workload somewhat similar to Jigdo. I could not find further public involvement of Dustin Rayner within Debian.

Aptitude-gtk, usability and GTK+ GUI for the Aptitude package manager

Presentation

A GTK+ GUI for Aptitude that will work alongside the improved current ncurses and command-line interfaces. This will offer an alternative to Synaptic with an interface design geared toward usability and advanced functionality.

Debian currently supports multiple non-command-line package managers, the most used being Synaptic and Aptitude. Synaptic uses a GTK+ interface but offers no command-line mode. Aptitude offers a command-line mode but no X interface, although it offers an ncurses interface. Comparing the interfaces of Synaptic and Aptitude reveals many design differences. Although Synaptic may be more accessible to beginners, Aptitude offers many interface behaviors and functions that are useful to regular and advanced users: fully hyperlinked tabbed navigation between packages and versions of packages, a mostly modeless interface, and an interactive dependency conflict resolver. The proposal was introduced by the student in coordination with Daniel Burrows, the mentor and developer of Aptitude.

Student

Obey Arthur Liu was a 22 year old French student of Computer Science and Applied Mathematics at Grenoble Institute of Technology - ENSIMAG, in France. Did I mention that he's also yours truly? If you want to know more, you might be interested in my previous post.

Result

This project was successful. The interface was mostly done and functional by the end of the summer. Daniel merged the code into the main post-lenny branch. Development is still ongoing and packages are released into Experimental. For further information, just read the rest of my blog. I could find some further public involvement of Obey Arthur Liu within Debian. Doh!

Lintian for fuller automated setups

Presentation

lintian, the Debian package checker, at the moment presents possible problems in three categories: errors, warnings and informational messages. This leads to several problems, most importantly that the severity and certainty of a check can't be expressed separately. In the course of this project, the student should design and implement in lintian an improvement of the current situation, for example by using a two-letter code (one for certainty, one for severity).

This project would make lintian errors much more fine-grained and help in maintaining pertinent quantitative analysis of package quality. The project was mentored by Marc Brockschmidt. The project proposal was jointly introduced by the Lintian team.

Student

UPDATED: Jordà Polo Bardés has done a lot of work on translation into Catalan, his native tongue. He can usually be found on #debian-catalan. He also maintains a few packages as a DM.

Result

This project was successful. The classification was entirely done. Jordà also helped with the new lintian.debian.org website. The Lintian team was very satisfied with the revamped errors list and new website. They have an immediate impact on package quality reporting. Jordà is still active within Debian, helping package a few games.

Debexpo, a generic web-based package repository

Presentation

mentors.debian.net is currently a very specialized web-based repository that allows everybody to contribute software packages to Debian without the need to be a Debian Developer (or Debian Maintainer). It has successfully helped simplify the sponsoring process in recent years. However, it needs to be refactored and in the process should be turned into a generic piece of software that can be used for other Debian source/binary package repositories, too.

Mentors is a very good initiative to recruit new package maintainers (and needs your help!) and the software underlying it could be reused for many different purposes (think PPA). The project was mentored by Christoph Haas. The project proposal was jointly introduced by the mentors team.

Student

Jonny Lamb was a Computer Science student in the United Kingdom. He was already quite involved within Debian, maintaining a lot of significant packages.

Result

This project was successful. The whole proposal was perfectly executed. Jonny now continues to develop debexpo, with the mailing-lists and commit logs showing interesting activity. Of course, help for debexpo is appreciated to get it into full shape. Jonny has since become a Debian Developer (here is his AM report). Congratulations to him.

It's nice to end on a nice note, isn't it? Now that we're done with the individual reports, I'm going to write down my recommendations report. Hopefully it will help with next year's Summer of Code.

21 September 2008

Frank Lichtenheld: Lintian 2.0.0~rc2 in experimental

For the impatient reader: New lintian in experimental, please test and give feedback. You will miss most changes though unless you read the rest of the post (Hint, Hint ;))

During the past week I've uploaded new lintian versions to experimental which we designated as release candidates for 2.0.0. Code-wise the changes are not much more intrusive than in many of our past releases, but they change the way lintian classifies tags in a fundamental way, thanks to the hard work of Jordà Polo in his Google Summer of Code project (mentored by Marc Brockschmidt).

Lintian Tag Classification, old and new

Previously lintian classified tags in only one dimension, in the categories "Info", "Warning", and "Error". While this worked reasonably well, the difference between the categories was not very well defined. The general idea was that everything violating a "must" in Debian Policy or endangering the building or usage of the package should be an "Error", i.e. something very similar to the definition of RC bugs (except that not all "must"s in Policy are deemed worthy of filing RC bugs). Some errors were downgraded to "Warning" or even "Info" though, on the basis that their detection was too prone to false positives.

Because of this, there was a long-standing desire to split the classification of tags into two dimensions, one for the impact/importance of the tag, and one for the certainty of its correct detection. This should make it easier for people to interpret and/or filter the output. At various points in the last few years people began to work on this but quickly gave up, usually overwhelmed by the sheer number of tags (728 in 2.0.0~rc2) to classify anew and by the need to make sure that the old and new categorisation could exist side-by-side (because breaking backwards compatibility was not really feasible). Finally, this year Jordà Polo decided to tackle this task as a Google Summer of Code project, with great success.

Tags are now classified in two dimensions: "Severity" (with the possible values wishlist, minor, normal, important, serious, which are intentionally very close to the available severities in the Debian bug tracking system), and "Certainty" (possible values: wild-guess, possible, certain). A third classification by "Source" (i.e. Policy, Developers Reference, ...) is planned but not yet fully implemented. For backwards compatibility there is a mapping from the new classification to the old one (which led to a few reclassifications of tags). The default output of lintian is unchanged. The new output formats that support the classification are still experimental (see below).

How to use it

You can specify exactly which levels of Severity and Certainty you want to have displayed with the new --display-level (-L) option. Please see the manual page for the details, but to give you an idea, the default behaviour (i.e. "show warnings and errors" in the "old" vocabulary) is equivalent to specifying

-L ">=important" -L "+>=normal/possible" -L +minor/certain
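To make the two-dimensional filtering a bit more concrete, here is a rough Python sketch of how such selectors could be evaluated. It is not lintian's actual code. It assumes that a comparison operator applies to both severity and certainty, that a selector without a certainty matches any certainty, and that the selectors are simply unioned (the "-" prefix and certainty-only selectors are left out). All function names are made up for the illustration.

```python
import operator
import re

# Orderings taken from the post, lowest to highest.
SEVERITIES = ["wishlist", "minor", "normal", "important", "serious"]
CERTAINTIES = ["wild-guess", "possible", "certain"]

OPS = {">=": operator.ge, ">": operator.gt, "=": operator.eq,
       "<=": operator.le, "<": operator.lt}

SELECTOR = re.compile(r"(>=|<=|>|<|=)?([a-z-]+)(?:/([a-z-]+))?")


def matches(selector, severity, certainty):
    """Check one selector such as '+>=normal/possible' against a tag.

    Assumption: the operator is applied to both dimensions, and a missing
    operator means '='.
    """
    m = SELECTOR.fullmatch(selector.lstrip("+"))
    if m is None:
        raise ValueError("unsupported selector: %r" % selector)
    op, want_sev, want_cert = m.groups()
    compare = OPS[op or "="]
    if not compare(SEVERITIES.index(severity), SEVERITIES.index(want_sev)):
        return False
    if want_cert is None:
        return True  # severity-only selector: any certainty is fine
    return compare(CERTAINTIES.index(certainty), CERTAINTIES.index(want_cert))


def displayed(selectors, severity, certainty):
    """A tag is shown if any of the (unioned) selectors matches it."""
    return any(matches(s, severity, certainty) for s in selectors)


# The default display level quoted above.
DEFAULT = [">=important", "+>=normal/possible", "+minor/certain"]
```

Under these assumptions, displayed(DEFAULT, "normal", "possible") is true while displayed(DEFAULT, "normal", "wild-guess") is not.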

And to get a report with only severe tags we're very certain of, you could use

-L ">=important/certain"

which will only display tags that have severity "important" or "serious" and a certainty of "certain".

There is also the (intentionally undocumented) option --exp-output which allows you to play with some experiments we're doing with the output format. --exp-output format=letterqualifier will give you an output very similar to the "classic" one, but with additional information about severity and certainty. --exp-output format=colons gives you a colon-separated format which includes all the possible information lintian currently has available during tag output and which should be easily machine-consumable. Note that these formats are experimental and might be changed at any point without notice. If you're interested in using alternative formats for lintian output, please join the mailing list and talk to us about it.

Etc.

Other changes include the usual share of bug fixes and of course:

New tags

description-contains-dh-make-perl-template
doc-base-uses-applications-section (actually a split of doc-base-unknown-section in two tags)
embedded-pear-module
embedded-php-library
improbable-bug-number-in-closes
maintainer-also-in-uploaders
maintainer-script-ignores-errors
manpage-has-errors-from-pod2man
ored-build-depends-on-obsolete-package (actually a split of build-depends-on-obsolete-package in two tags)
package-superseded-by-perl
versioned-dependency-satisfied-by-perl
windows-devel-file-in-package

Credits

This lintian release is brought to you by (sorted by number of changesets):

Jordà Polo
Frank Lichtenheld
Adam D. Barratt
Raphael Geissert
Russ Allbery
Niko Tyni
Marc 'HE' Brockschmidt

19 September 2008

Lucas Nussbaum: Looking for cliques in the GPG signatures graph

The strongly connected set of the GPG keys graph contains a bit more than 40000 keys now (yes, that’s a lot of geeks!). I wondered what the biggest clique (complete subgraph) in that graph was, and also of course the biggest clique I was in. It’s easy to grab the whole web of trust there.

Finding the maximum clique in a graph is NP-hard, but there are algorithms that work quite well for small instances. And you don’t need to consider all 40000 keys: to be in a clique of n keys, a key must have at least n-1 signatures, so it’s easy to simplify the graph. If you find a clique with 20 keys, you can remove all keys that have fewer than 19 signatures.

My first googling result pointed to Ashay Dharwadker’s solver implementation (which also proves P=NP ;). Googling further allowed me to find the solver provided with the DIMACS benchmarks. It’s clearly not the state of the art, but it was enough in my case (it found the result almost immediately).

The biggest clique contains 47 keys. However, it looks like someone had fun and injected a lot of bogus keys into the keyring. See the clique. So I ignored those keys, and re-ran the solver. And guess what’s the size of the biggest “real” clique? Yes. 42. Here are the winners:

CF3401A9 Elmar Hoffmann
AF260AB1 Florian Zumbiehl
454C864C Moritz Lapp
E6AB2957 Tilman Koschnick
A0ED982D Christian Brueffer
5A35FD42 Christoph Ulrich Scholler
514B3E7C Florian Ernst
AB0CB8C0 Frank Mohr
797EBFAB Enrico Zini
A521F8B5 Manuel Zeise
57E19B02 Thomas Glanzmann
3096372C Michael Fladerer
E63CD6D6 Daniel Hess
A244C858 Torsten Marek
82FB4EAD Timo Weingärtner
1EEF26F4 Christoph Ulrich Scholler
AAE6022E Karlheinz Geyer
EA2D2C41 Mattia Dongili
FCC5040F Stephan Beyer
6B79D401 Giunchedi Filippo
74B11360 Frank Mohr
94C09C7F Peter Palfrader
2274C4DA Andreas Priesz
3B443922 Mathias Rachor
C54BD798 Helmut Grohne
9DE1EEB1 Marc Brockschmidt
41CF0322 Christoph Reeg
218D18D7 Robert Schiele
0DCB0431 Daniel Hess
B84EF12A Mathias Rachor
FD6A8D9D Andreas Madsack
67007C30 Bernd Paysan
9978AF86 Christoph Probst
BD8B050D Roland Rosenfeld
E3DB4EA7 Christian Barth
E263FCD4 Kurt Gramlich
0E6D09CE Mathias Rachor
2A623F72 Christoph Probst
E05C21AF Sebastian Inacker
5D64F870 Martin Zobel-Helas
248AEB73 Rene Engelhard
9C67CD96 Torsten Veller

It’s likely that this happened thanks to a very successful key signing party somewhere in Germany (looking at the email addresses). [Update: It was the LinuxTag 2005 KSP.] It might be a nice challenge to beat that clique during the next DebConf ;) And the biggest clique I’m in contains 23 keys. Not too bad.
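For readers who want to try this at home, the degree-based simplification described above can be sketched in a few lines of Python. This is not the DIMACS solver used for the post: the search below is a plain Bron-Kerbosch enumeration with pivoting, swapped in for illustration. It assumes an undirected graph given as a dict mapping each key ID to the set of keys it is mutually signed with; the function names are made up.

```python
def prune_by_degree(adj, k):
    """Drop every vertex that cannot be part of a clique of size k.

    A member of a k-clique needs at least k-1 neighbours. Removing a vertex
    lowers the degree of its neighbours, so repeat until nothing changes.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    while True:
        weak = [v for v, ns in adj.items() if len(ns) < k - 1]
        if not weak:
            return adj
        for v in weak:
            for n in adj.pop(v):
                if n in adj:
                    adj[n].discard(v)


def max_clique(adj):
    """Return one maximum clique (as a list) via Bron-Kerbosch with pivoting."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:          # r is a maximal clique
            if len(r) > len(best):
                best = r
            return
        if len(r) + len(p) <= len(best):
            return                   # cannot beat the best clique found so far
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in p - adj[pivot]:
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best
```

A typical use would be to run max_clique once on a small subgraph to get a lower bound, call prune_by_degree with that size, and search again on what is left.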

30 August 2008

Andreas Barth: stale vs incomplete: xen vs kvm

Marc Brockschmidt and I are currently evaluating which virtualization to use on our new server. We want our virtual systems to feel like real systems, and we want an open source solution. So vserver and the like are out of the game, as is VMware. The two remaining solutions we looked at are Xen and kvm.

Xen of course has the advantage of being mature. We also have experience with running Xen servers, and with the issues that can happen, like dom0 getting disconnected from xend and then having to reboot the server the hard way. However, the most serious disadvantage is that development has practically stalled at the 2.6.18 kernel. Of course, even today one could install a new server based on Etch, but that doesn't really feel right. There is some development ongoing to run domUs with newer kernels (like in Lenny), but there currently isn't any new kernel available for dom0.

kvm is a more recent addition to the virtualization camp, and is basically "qemu on steroids". It all looks rather promising, and development happens on recent kernels. However, kvm lacks a few features of e.g. Xen. This includes the ability to reboot dom0 (and the hardware) and just let the domUs survive. Or to have a nice management script where one could just say "xm shutdown $domU", and basically have the power button pressed on the virtual machine. Or to just attach to and detach from the virtual console whenever one wants. None of that is impossible with kvm: one could attach the command terminal to one pipe and the Linux console to another, and attach and detach via one's own scripts. But all of that should be expected to be available from a solution that calls itself enterprise ready. (And writing your own scripts always brings the possibility of making your own mistakes.)

However, the worst issue of all is that kvm is underdocumented (or rather: there are lots of different places where parts of the documentation are hidden, including the great remark in the man page "The other options are similar to those of qemu."). So, what to do? Invest more time in a solution that seems like a dead end? Or put up with the incompleteness of the other solution?
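To illustrate the "own scripts" route mentioned above, here is a small Python sketch of a stand-in for "xm shutdown". It assumes the guest was started with its monitor on a Unix socket (qemu's "-monitor unix:PATH,server,nowait") and sends the monitor command system_powerdown, which asks the guest to power off as if the power button had been pressed. The socket path and the function names are invented for this example, and there is no error handling.

```python
import socket


def send_monitor_command(sock, command):
    """Send one line to an already connected monitor socket."""
    sock.sendall(command.strip().encode("ascii") + b"\n")


def powerdown(monitor_path):
    """Connect to the monitor socket of one guest and press its power button.

    monitor_path is whatever path was given to -monitor unix:... when the
    guest was started (hypothetical example: /var/run/kvm/guest1.monitor).
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(monitor_path)
        send_monitor_command(sock, "system_powerdown")
```

Which is exactly the kind of glue one would rather not have to write and maintain oneself.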

25 April 2008

Joerg Jaspert: Some more merges and stuff

Following my first ftpmaster blog post, which got some nice replies, let's have the next one, as I didn’t stop working on stuff. Not as much as I hoped, but well, something. So, what was done since then?

Merged a little patch from Christoph Berg fixing usage of a function - which doesn’t exist in global namespace, so it needs the full name.
Merged another patch from Christoph - now we write the content of the Changed-By field from .changes files into projectb. To cite him for the reason: This will allow to identify NMUs and sponsored uploads more precisely in tools querying projectb. Broke dinstall with this and, as a little side-effect, also the merkel copy of the projectb database. Oops. Well, fixed it again immediately. :) There is some old data available, which I could import into projectb, so it would have data for all packages. But there is some weirdness with the dataset I got from Christoph and I am waiting for him to provide a solution. Anyway, all new uploads already store this data, so currently only old data is missing.
Merged a small patch from Thomas Viehmann into our archive software which enables it to send a copy of all mails to the person sponsoring the upload. This is in effect for all future uploads, not for any package which already hit the queues (like those in NEW right now). We
- only send the extra copy if it is a sourceful upload, to not spam buildd admins,
- only send the mails to sponsors who did not disable their @debian.org address.
For those interested, the process works similar to this:
- We check whether the Debian login attached to the key signing the upload has an active @debian.org address.
- If that is the case, we look for all uids on that key that look like an email address and compare them with the Maintainer: and Changed-By: fields in the .changes file.
- If there is no match, i.e. none of the “emails” of the key uids is in those fields, the @debian.org mail is added to the set of valid receivers.
Changed split-done to only move files older than 30 days and let it run from our weekly cronjob. Also ran it once, which moved 120079 files into their $YEAR/$MONTH subdirectories. (split-done splits up the subdir where we store every .changes of accepted uploads. That grows fast, and we all know how badly filesystems perform with such a number of files. The above 120079 files are from November/December 2007 until now, so imagine how many we have in total, as we have them back to 2002.)
Merged a patch I wrote 7 days ago that does a little sanity checking in the Provides field. Do not try to version your Provides, that is forbidden and actually breaks software. (See this bug about it.)
Merged a patch, submitted by Marc Brockschmidt some long time ago, that lets cruft-report also display NBS for experimental. Changed the cron script to use that extra feature.
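The sponsor-mail steps listed above can be written down as a short Python sketch. This is not the code merged into dak, only a restatement of the three steps plus the sourceful-only rule. The function and parameter names are made up, and the comparison is done on plain lower-cased addresses.

```python
def sponsor_copy_address(sourceful, debian_address, key_uid_emails,
                         maintainer_email, changed_by_email):
    """Return the @debian.org address to copy on mails, or None.

    sourceful        -- True if the upload contains source
    debian_address   -- active @debian.org address of the login attached to
                        the signing key, or None if there is none / disabled
    key_uid_emails   -- email addresses found in the uids of the signing key
    maintainer_email, changed_by_email -- addresses from the .changes file
    """
    if not sourceful:
        return None  # binary-only upload: do not spam buildd admins
    if debian_address is None:
        return None  # no active @debian.org address
    in_changes = {maintainer_email.lower(), changed_by_email.lower()}
    if any(email.lower() in in_changes for email in key_uid_emails):
        return None  # signer already gets the mails: not a sponsored upload
    return debian_address
```

Only when none of the key's addresses shows up in Maintainer: or Changed-By: is the upload treated as sponsored and the extra copy sent.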

And as in my last blog post, let's add some items to the todo list:

Enable dak to read a Built-By field from the .changes file. That's the sanest solution to sponsored uploads. And also better for buildds, which then should no longer overwrite the Maintainer field with the -m parameter when building packages. Exact semantics to be defined, but dak probably would do something similar to
- Built-By same as Changed-By or Maintainer? - do not send mail there.
- Built-By differs from Changed-By or Maintainer and it's sourceful? Send mail to it too.
- Built-By differs from Changed-By or Maintainer and binary-only? Only send mail there.
- Built-By differs from Changed-By or Maintainer and binary-only? Only send mail there.
Track the Uploaders: field in projectb. That would help the BTS to offer another display.
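Since the exact Built-By semantics are still to be defined, the following Python sketch is only one possible reading of the three cases in the first todo item. It assumes that the usual recipients are the Maintainer and Changed-By addresses, and that "do not send mail there" means "no extra mail". Names are invented for the example.

```python
def mail_recipients(maintainer, changed_by, built_by, sourceful):
    """Return the set of addresses to mail for one upload (sketch only)."""
    usual = {maintainer, changed_by}
    if built_by is None or built_by in usual:
        return usual                  # same person: nothing extra to send
    if sourceful:
        return usual | {built_by}     # sponsored source upload: mail both
    return {built_by}                 # binary-only (e.g. buildd): only there
```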

Having written all this - today (AKA Saturday, when writing this) won’t see code merges by me. I think most of my ftpmaster related work today will be fun with the NEW queue. For some unknown reason people keep uploading packages. Tsss.

17 March 2008

Pierre Habouzit: [RC-Bug-A-Day] Day 3

Since serious, grave and critical bugs aren't the only ones we have to fix for the release, the RM Team spontaneously decided to launch a frontal attack on the g++-4.3 FTBFSes. All in all we probably did more than 110 to 120 NMUs to fix g++-4.3 FTBFS bugs. Our champion is definitely Marc Brockschmidt, who uploaded 50 packages. I NMUed 22, and also did one QA upload. Add to that 5 sponsored NMUs for Cyril Brulebois, and one for Arthur Loiret, who happens to be my NM and fixed one of those bugs for his T&S. There are still twice as many to fix. Those are really easy bugs to deal with, just give it a try! I'd like to thank Cyril Brulebois for his awesome work on the g++-4.3 release goal. A shame that he's still elmo-ed, as he wrote more than 150 patches, and could have uploaded them himself if he were not still waiting for his account. Shame on us.

3 February 2008

Christian Perrier: No more badly encoded debian/control

In today's announcement of the release update, Marc Brockschmidt mentioned the release goal of making control and changelog files UTF-8-clean. I was also working on Pootle for i18n.debian.net, and such non-clean files for linux-doc-html-pt and linux-doc-text-pt were annoying me. So, I just decided to NMU these packages and fix the 20th century ISO-8859-1 encoding they use in debian/control. As this was easy, I decided to NMU (0-day... all these packages had longstanding bug reports while the fix is trivial) all other packages with badly encoded debian/control files and, as of tonight, there should not be such files anymore in unstable. Maintainers with badly encoded debian/changelog, watch out for your asses, I could decide to run such a campaign for your stuff. So, please fix this: this is as easy as using "iconv".
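For those who prefer a script over running "iconv -f ISO-8859-1 -t UTF-8" by hand, here is a minimal Python sketch of the same fix. It assumes that a file which is not valid UTF-8 is ISO-8859-1, which matches the packages mentioned above but is of course only a guess in general. The function name is made up.

```python
def to_utf8(raw, fallback="iso-8859-1"):
    """Return the bytes of a control/changelog file re-encoded as UTF-8.

    Input that already decodes as UTF-8 is returned unchanged; anything
    else is assumed to be in the fallback encoding.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")
```

Reading debian/control in binary mode, passing the bytes through to_utf8 and writing them back is then all there is to it.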

14 March 2007

Nico Golde: New Maintainer

Today I got accepted as a Debian Developer. I would like to thank the following people (in random order) for their involvement in my NM process, and for mentoring and sponsoring uploads.

Philipp Kern, Michael Schiansky, Daniel Baumann, Héctor García, Norbert Tretkowski, Werner Heuser, Marc Brockschmidt, Lars Wirzenius, Christoph Berg, Jörg Jaspert, James Troup, Martin Krafft (feel yourself added if I forgot you, hopefully not)

Oh and congratulations Holger.

6 January 2007

Bernhard R. Link: clean vs. crowded bug pages

Marc Brockschmidt wrote the BTS is too crowded and Joey Hess objected that a too clean BTS can also be a bad sign.

I think both are true, or better put, neither approach makes sense without the other:

Bug reports are in my eyes one of the most valuable resources we have. No one can test everything, even in almost trivial packages. To achieve quality we need the users' input, and a badly worded bug report is still better than no bug report at all. Our BTS is a very successful tool in that respect, as it lowers the barrier to reporting issues. No hassle of creating (and waiting for completion of) an account, no annoyance from getting funny unparseable mails about some developer changing their e-mail address (did I already say I hate bugzilla?).

As those reports are valuable information, one should keep them as long as they can be useful. Looking at the description of the wontfix tag shows that even a request that cannot be or should not be fixed in the current context is considered valuable. Most programs and designs change, and it helps to have a place to document workarounds and keep track of which open problems exist.

On the other hand a crowded bug list is like a fridge you only put food into. Over time it will start to degrade into the most displeasing form of a compost heap. The same holds for bug reports:

Most bugs are easier when they are young: You most probably have the same version as the submitter somewhere, know what changed recently, and when you can reproduce it you get some hints on what is happening and can get at it. If you cannot reproduce it, the submitter might still be reachable for more information.

When the report is old, things get harder. Is the bug still present? Was it fixed in between by some upstream release? Is the submitter still reachable, and do they still remember what happened?

When I care enough about a problem to write a bug report and try to supply a patch for it, I always try to take a look at the bug list and look for some other low hanging fruit to pick, and submit some other patch, too. (After all, most of the time is spent trying to understand the package and the strange build system upstream chose instead of plain old autotools, not fixing the problem.) But when it is hard to see the trees because of all the dead wood around them, and there is nothing to find that comes with some way to reproduce it, and one knows far too well that the most efficient step would be a tedious search through old versions to see if that was a bug solved upstream many years ago, good intentions tend to melt like ice thrown in lava.

So, when I wrote that both are true, I meant that keeping real-world issues documented and visible is a good thing. But letting bugs rot (and often they do) will pervert all the advantages. In the worst case, people will even stop submitting new reports, as it takes too long to look through all the old ones for a duplicate.

2 August 2006

Martin Zobel-Helas: For those who wondered about proposed-updates

Thanks to the collaborative work of Alexander Wirt, Marc Brockschmidt, Julien Danjou and me, there is now a public list of packages which are in proposed-updates and currently await moderation. The backlog in that queue is currently decreasing drastically, thanks to AJ. Builds of the new D-I for Sarge r3 should start as soon as all needed packages have spread to all mirrors. Update: Ooops, perhaps I should also add the link to that page.