Search Results: "Florian Ragwitz"

8 May 2009

Florian Ragwitz: MongoDB on CPAN

I've been doing some contracting work for 10gen recently. They have that rather cool open source document database called MongoDB and they wanted me to write a module to use it from Perl. I did that and the code is now available on CPAN and github. Writing it was fun, and I'm already looking forward to being able to use MongoDB as a backend for KiokuDB. I started writing code for that and put it on github, but it isn't passing all the tests just yet. In related news, after finishing the MongoDB module, I'm available for other things again. So if you're looking for a Perl telecommuter, let me know.

29 April 2009

Florian Ragwitz: Running tests that require an X server

Lots of CPAN distributions require some kind of graphical environment. Some of them even pop up windows, which is not only very annoying, but also sometimes fails if you're using a tiled window manager. To test such distributions on a machine where no graphical environment is available, or on your desktop while you're working and don't want to get annoyed to death, you can use a fake X server like Xvfb. The easiest way to do that is to run
$ xvfb-run -a make test
instead of a plain make test. That'll automatically create a fake X server, set up DISPLAY and run make test in that environment. That works well for manually installing modules. When installing using CPAN.pm you can make things easier by writing a distropref. First, tell CPAN.pm where your distroprefs are. I use ~/.cpan/prefs:
$ cpan
cpan[1]> o conf init prefs_dir
[...]
<prefs_dir>
Directory where to store default options/environment/dialogs for
building modules that need some customization? [] /home/rafl/.cpan/prefs
cpan[3]> o conf commit
commit: wrote '/home/rafl/.cpan/CPAN/MyConfig.pm'
Now write a distropref for the modules that need an X server and put it into your prefs dir as X11.yml:
---
match:
  distribution:
    /(?x:Wx
       |Gtk2
       |Gnome2
       # ... other modules requiring an X server
    )-\d/
test:
  commandline: "xvfb-run -a make test"
Now the tests for Wx, Gtk2, Gnome2 and all other distributions you list in that regex will be executed with a fake X server. I have yet to figure out how to write a distropref that just prepends to the test commandline instead of replacing it, so I won't need to have another pref for all modules using Module::Build.

28 April 2009

Florian Ragwitz: Implementing Typed Lexical Variables

For quite some time perl has provided a form of my declarations that includes a type name, like this:
my Str $x = 'foo';
However, that didn't do anything useful until Vincent Pit came along and wrote the excellent Lexical::Types module, which allows you to extend the semantics of typed lexicals and actually make them do something useful. For that, it simply invokes a callback for every my declaration with a type in the scopes it is loaded in. Within that callback you get the variable that is being declared as well as the name of the type used in the declaration. We also have Moose type constraints and the great MooseX::Types module, which allows us to define our own type libraries and import the type constraints into other modules. Let's glue those modules together. Consider this code:
use MooseX::Types::Moose qw/Int/;
use Lexical::Types;
my Int $x = 42;
The first problem is that the perl compiler expects a package with the name of the type used in my to exist. If there's no such package, compilation will fail. Creating top-level namespaces for all the types we want to use would obviously suck. Luckily, the compiler will also look for a function with the name of the type in the current scope. If that exists and is inlineable, it will call that function and use the return value as the package name. In the above code snippet an Int function already exists; we imported it from MooseX::Types::Moose. Unfortunately it isn't inlineable. Even if it were, compilation would still fail, because it would return a Moose::Meta::TypeConstraint instead of a valid package name. To fix that, let's rewrite the code to this:
use MooseX::Types::Moose qw/Int/;
use MooseX::Lexical::Types qw/Int/;
my Int $x = 42;
Let's also write a MooseX::Lexical::Types module that replaces existing imported type exports with something that can be inlined and returns an existing package name based on the type constraint's name.
package MooseX::Lexical::Types;
use Class::MOP;
use MooseX::Types::Util qw/has_available_type_export/;
use namespace::autoclean;
sub import {
    my ($class, @args) = @_;
    my $caller = caller();
    my $meta = Class::MOP::class_of($caller) || Class::MOP::Class->initialize($caller);
    for my $type_name (@args) {
        # get the type constraint by introspecting the caller
        my $type_constraint = has_available_type_export($caller, $type_name);
        my $package = 'MooseX::Lexical::Types::TYPE::' . $type_constraint->name;
        Class::MOP::Class->create($package);
        $meta->add_package_symbol('&'.$type_name => sub () { $package });
    }
    Lexical::Types->import; # enable Lexical::Types for the caller
}
1;
With that, the example code now compiles. Unfortunately it breaks every other use case of MooseX::Types. The export will still need to return a Moose::Meta::TypeConstraint at run time, so this will continue to work:
has some_attribute => (is => 'ro', isa => Int);
So instead of returning a plain package name from our exported function we will return an object that delegates all method calls to the actual type constraint, but evaluates to our special package name when used as a string:
my $decorator = MooseX::Lexical::Types::TypeDecorator->new($type_constraint);
$meta->add_package_symbol('&'.$type_name => sub () { $decorator });
and:
package MooseX::Lexical::Types::TypeDecorator;
use Moose;
use namespace::autoclean;
# MooseX::Types happens to already have a class that doesn't do much
# more than delegating to a real type constraint!
extends 'MooseX::Types::TypeDecorator';
use overload '""' => sub {
    'MooseX::Lexical::Types::TYPE::' . $_[0]->__type_constraint->name
};
1;
Now we're able to use Int as usual and have Lexical::Types invoke its callback on MooseX::Lexical::Types::TYPE::Int. Within that callback we will need the real type constraint again, but as the callback is invoked as a class method with no good way to pass in additional arguments, we need to store the type constraint somewhere. I chose to simply add a method to the type class we create when constructing our export. After that, all we need to do is implement our Lexical::Types callback. We will put it in a class all our type classes will inherit from:
Class::MOP::Class->create(
    $package => (
        superclasses => ['MooseX::Lexical::Types::TypedScalar'],
        methods      => {
            get_type_constraint => sub { $type_constraint },
        },
    ),
);
The Lexical::Types callback will now need to tie things together by modifying the declared variable so it will automatically validate values against the type constraint when being assigned to. There are several ways of doing this. Using tie on the declared variable would probably be the easiest thing to do. However, I decided to use Variable::Magic (also written by Vincent Pit - did I mention he's awesome?), because it's mostly invisible at the perl level and also performs rather well (not that it'd matter, given that validation itself is relatively slow):
package MooseX::Lexical::Types::TypedScalar;
use Carp qw/confess/;
use Variable::Magic qw/wizard cast/;
use namespace::autoclean;
my $wiz = wizard
    # store the type constraint in the data attached to the magic
    data => sub { $_[1]->get_type_constraint },
    # when assigning to the variable, fail if we can't validate the
    # new value ($_[0]) against the type constraint ($_[1])
    set  => sub {
        if (defined (my $msg = $_[1]->validate(${ $_[0] }))) {
            confess $msg;
        }
        ();
    };
sub TYPEDSCALAR {
    # cast $wiz on the variable in $_[1]. pass the type package name
    # in $_[0] to the wizard's data construction callback.
    cast $_[1], $wiz, $_[0];
    ();
}
1;
With this, our example code now works. If someone tries to assign, say, 'foo' to the variable declared as my Int $x, our magic callback will be invoked, try to validate the value against the type constraint and fail loudly. WIN! The code for all this is available on github and should also be on CPAN shortly. You might notice warnings about mismatching prototypes. Those are caused by Class::MOP and fixed in the git version of it, so they'll go away with the next release. There are still a couple of caveats, but please see the documentation for those.

26 April 2009

Florian Ragwitz: Declaring Catalyst Actions

For a long time the Catalyst Framework has been using code attributes to allow users to declare actions that certain URLs get dispatched to. That looks something like this:
sub base    : Chained('/')    PathPart('') CaptureArgs(0) { ... }
sub index   : Chained('base') PathPart('') Args(0)        { ... }
sub default : Chained('base') PathPart('') Args           { ... }
It's a nice and clean syntax that keeps all the important information right next to the method it belongs to. However, attributes in perl have a couple of limitations. For one, the interface the perl core provides to use them is horrible and doesn't provide nearly enough information to do a lot of things, but most importantly attributes are just plain strings. That means you need to parse something like "Chained('base')" into (Chained => 'base') yourself to make proper use of them. While that's easy for the above example, it can be very hard in the general case, because only perl can parse Perl. It's one of the reasons you can't use Catalyst::Controller::ActionRole to apply parameterized roles to your action instances: parsing parameters out of things like Does(SomeRole => { names => [qw/affe tiger/], answer_re => qr/42/ }) would be awful and wrong. With Catalyst 5.8 most of the attribute related code has been removed from the internals. It's now using MooseX::MethodAttributes to do all the heavy lifting. Also, the internals of how actions are registered have been refactored to make it easier to implement alternate ways without changing the Catalyst core. As a proof of concept for this I implemented a new way of declaring actions that's very similar to how Moose provides its sugar functions. You can get it from github. With that, the above example looks like this:
action base    => (Chained => '/',    PathPart => '', CaptureArgs => 0) => sub { ... };
action index   => (Chained => 'base', PathPart => '', Args     => 0    ) => sub { ... };
action default => (Chained => 'base', PathPart => '', Args     => undef) => sub { ... };
It also moves method declaration from compiletime to runtime, making this possible:
for my $action (qw/foo bar baz/) {
    action $action => (Chained => 'somewhere', Args => 0) => sub {
        my ($self, $ctx) = @_;
        $ctx->stash->{$action} = $ctx->model('Foo')->get_stuff($action);
    };
}
Admittedly, that's all very ugly, but it illustrates well what kind of things we're able to do now. But it doesn't need to be ugly. With Devel::Declare we have a great tool to add our own awesome syntax to perl, similar to what things like MooseX::Method::Signatures, MooseX::MultiMethods and MooseX::Declare do. So what would a declarative syntax for Catalyst controllers look like? I don't know. Ideas include something like this:
under /some/where, action foo ('foo', $id) { ... }
to mean:
sub foo : Chained('/some/where') PathPart('foo') CaptureArgs(1) { ... }
Adding Moose type constraints to this would be interesting, too, and make validation of captures and arguments a lot easier. Multi dispatch similar to MooseX::MultiMethods could be handy as well:
under /some/where {
    action ('foo', Int $id) {
        # find and stash an item by id
    }
    action ('foo', Str $name) {
        # search items using $name
    }
    action ('foo', Any $thing) {
        # display error page
    }
}
So you see there are a lot of possibilities that should be explored. Unfortunately I have no idea what kind of syntax and features people would like to have, so your feedback on this would be much appreciated. :-)

19 August 2007

Alexis Sukrieh: The road to libdevel-repl-perl, part 2

Thanks to Florian Ragwitz, who packaged libpadwalker-perl 1.5-1, there is no blocker anymore preventing libdevel-repl-perl from entering sid. By the way, the author of Devel::REPL, Matt S Trout, looks pretty happy to see his module entering Debian. I've just uploaded libdevel-repl-perl; this upload closes the exciting work session we did over the weekend with Damyan Ivanov in order to get the module into Debian. All its dependencies are now in the Perl group's hands. That was fun. Team maintenance rocks!

16 August 2007

Alexis Sukrieh: Perl Console 0.2 Debian package

The first version of the Debian package of Perl Console has been uploaded to the NEW queue. For those who are waiting for it, I've also uploaded the package here. Thanks to the patch sent by Antonio Terceiro, version 0.3 will be properly packaged à la Perl (namely with the famous Makefile.PL, MANIFEST and friends). I plan to address the multi-line issue for 0.3 (mainly handling code with loops or conditional structures); as Florian Ragwitz underlined, it could be worth using Devel::REPL instead of reinventing the wheel.

24 June 2007

Dirk Eddelbuettel: New OpenMPI packages

Debian has had an OpenMPI package since early last year, when Florian Ragwitz made some initial stabs at packaging it. The package has seen a number of NMUs and patches since then, but was generally gathering cobwebs ... which was too bad, because OpenMPI seems to have some wind behind its sails upstream. Unfortunately, little of that got packaged for Debian. After some discussions on and around the debian-science list, a new maintainer group was formed on Alioth under the pkg-openmpi name. Tilman Koschnick (who had already helped Florian with patches), Manuel Prinz, Sylvestre Ledru and myself have gotten things into good enough shape in reasonably short time. And I have just uploaded a lintian-clean package set, openmpi_1.2.3-0, to Debian, where it is expected to sit in the NEW queue for a little bit before moving on to the archive proper. The changelog entry (which will appear here eventually) shows twelve bugs closed. Our plan is to provide a stable and well maintained MPI implementation for Debian. OpenMPI is the designated successor to LAM, and apart from MPICH2, everybody seems to have thrown their weight behind OpenMPI. So we will try to work with the other MPI maintainers to come up with sensible setups, alternatives priorities and the like. If you are interested in MPI and would like to help, come join us at the Alioth project pkg-openmpi. Last but not least, thanks to Florian for the initial packaging, and to Clint Adams, Mark Hymers, Andreas Barth, and Steve Langasek (twice even) for NMUs.

22 October 2006

Florian Ragwitz: Audio::XMMSClient::XMLRPC

I just uploaded Audio::XMMSClient::XMLRPC 0.01 to the CPAN. It's basically an XMLRPC interface to the xmms2 daemon. It ties all Audio::XMMSClient functions except for the signal and broadcast stuff to an XMLRPC API which is quite similar to the Perl API of Audio::XMMSClient. It will be available from CPAN in some hours and can be grabbed from the PAUSE incoming queue in the meantime.

15 October 2006

Florian Ragwitz: Random seeking on gzip streams

After asking the lazyweb how to seek on gzip streams I got a very useful and comprehensive reply from Paul Sladen. As my previous searches on this topic didn't turn up many useful things, I'd like to publish Paul's reply (with his approval) here to help other people with the same problem as well:
Paul Sladen:
Seeking on compressed streams is a little like reading sectors from a harddisk; to read one byte requires reading a much larger sector. Two things are needed: the sector size and the sector length. On a hard-disk the sector length is nominally 512, with the mapping between an offset and sector start often as simple as $offset >> 9. In compressed streams, the relationship between stream position and sector start is non-linear. Using a non-linear mapping is more involved, as searching a lookup table is required. The mapping from an uncompressed position cannot be calculated at all; a 'zmap' table listing every sector start position and the corresponding on-disk point needs to be created. The size of a sector must be determined as well.

Bzip2: It is fairly easy to seek on compressed bzip2 streams. In a bzip2 stream, each block is totally separate and self contained---have a look at the 'bzip2recover' program. 'bzip2recover' scans through a bzip2 stream locating each block/'segment' (normally 900kB of input, so ~200kB of compressed output), then copies each segment to a separate bzip2 file.

Gzip: Seeking on gzip streams is somewhat more involved; gzip uses a dictionary design where back-references are made to uncompressed data within the last 32kB ("the window"). The only safe place to start reading a 'sector' in a gzip stream is somewhere the dictionary size is zero. An empty dictionary occurs at the start of a stream, or where the stream has been 'reset'.

Gzip resets: A gzip 'sector' (the length required to be read to safely decode a byte) could be the full length of the file, or sector starts could occur more frequently. A gzip sector start occurs when the stream is reset---resetting the stream more frequently leads to more sector starts, but a reduction in compression thanks to less use of the sliding dictionary. Trade-off.
The "gzip --rsyncable" option is an example of stream resetting; in this case the stream is reset (and the dictionary closed) each time the sum of the last 4096 octets of input data, modulo 65536(?), equals a magic number (zero normally). In exchange for the extra reset/start points, you get a size increase of 4-5%, but a better "hit rate" for random access.

Gzip reset point lookup tables: More likely you'd want to reset the stream every e.g. 4096kB of input, then save the corresponding position of the output stream for a given input position. This lookup table of start positions needs to be stored somewhere. 'squashfs' does so in a separate file; 'zsync' in the zmap. (If you wanted to define a standard, these reset points could be stored in the 'extra' or 'comment' fields at the start of a gzip file; storing them with a suitable header would enable some degree of seeking for sufficiently intelligent readers---files could be post-processed to add this data.) Storing a LUT takes space; adding extra reset points and storing the LUT data for these extra points takes more space---but I found storing the LUT as a set of double-deltas cut it down; it could be gzipped as well.

Gzip partial resets: A gzip stream doesn't just contain 'reset' points, but also 'partial reset' points; a partial reset resets the huffman tables, but doesn't reset the dictionary. It's safe to stop reading at a partial reset (which effectively shortens the sector length). Whilst not normally possible, partial reset points /are/ able to be used as a starting point /if/ you can populate the previous uncompressed 32kB that forms the dictionary window from another source---'zsync' does this when reconstructing a file; however, 'zsync' only does construction in a linear fashion, from the start of a file.

Further reading: Magic words to search for: zsync, squashfs, apt-sync, succinct. Succinct is my unfinished project. Succinct involves many parts, one of which covers what you're doing.
It might be good to break off the seeking of compressed streams into a separate project---many small pieces make big problems easier! What I might do is store the LUT in the EXTRA field of the gzip header (limited to 64kB though). Still, within 64kB it would always be possible to construct a useful LUT; based on the final length of the input data, the granularity of the LUT could be reduced such that it only mentions 50%, 30%... even 5% of the reset points. The LUT itself could be gzipped for extra gain. Hope that's useful, -Paul PS. In all the cases (except one) where I mentioned gzip, I probably meant 'zlib' or 'deflate'! :)
In a later mail Paul also added the following:
Paul Sladen:
BTW, in my search around a bit later for something else I turned up two more things that might be interesting: dictzip, and examples/zran.c in the zlib source. 'dictzip' is a similar idea to what I suggested at the end of the previous email. A lookup table is built and stored in the EXTRA section at the start of the gzip file, and the utilities ('dictzcat' et al.) then use this look-up table for random access on the data. 'dictzip' is still slightly wide of the mark, forcing a flush/chunk every 58969 bytes. That number is chosen so that even in the worst case, the compressed version will never be more than 2^16 in length. However I think their math is slightly wrong, as even in the worst case gzip only generates 5 bytes per 32kB for an uncompressed store -> ~65514 by my reckoning. Squashfs marks uncompressable chunks by setting the top bit of the file offset to indicate a true store. Their lookup table format appears to be flexible enough that I /might/ be able to create a table by scanning an existing stream rather than re-encoding. Depends on the 64kB limitation. Mmm.
Thanks Paul!
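Paul's reset-point idea is easy to sketch with zlib's full-flush mechanism. The following Python snippet is illustrative only (not from the original mails): it full-flushes a raw deflate stream every 4kB of input, which byte-aligns the output and empties the dictionary, records a lookup table of (uncompressed, compressed) offset pairs, and then restarts decompression at the nearest reset point before a target offset.

```python
import zlib

# Build a seekable raw-deflate stream: a full flush every CHUNK bytes of
# input resets the dictionary, giving a safe restart point, and we record
# a lookup table of (uncompressed offset, compressed offset) pairs.
CHUNK = 4096
data = bytes(range(256)) * 64  # 16 KiB of sample data

comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # raw deflate, no gzip header
out = bytearray()
lut = [(0, 0)]
for i in range(0, len(data), CHUNK):
    out += comp.compress(data[i:i + CHUNK])
    out += comp.flush(zlib.Z_FULL_FLUSH)  # reset point: empty dictionary
    lut.append((i + CHUNK, len(out)))
out += comp.flush(zlib.Z_FINISH)

# "Seek" to uncompressed offset 10000: find the last reset point before
# it, start a fresh inflater there, and index past the leading bytes.
target = 10000
u_off, c_off = max(p for p in lut if p[0] <= target)
decomp = zlib.decompressobj(-15)
plain = decomp.decompress(bytes(out[c_off:]))
assert plain[target - u_off:target - u_off + 16] == data[target:target + 16]
```

The double-delta and EXTRA-field storage tricks Paul mentions would apply to the lut built here; zlib's examples/zran.c implements the same idea in C.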

13 October 2006

Florian Ragwitz: Plat_Forms: The web development platform comparison

http://www.plat-forms.org/index.htm
Plat_Forms is an international programming contest. It aims at comparing different technological platforms for developing web-based applications: Java EE, .NET, PHP, Perl, Python, Ruby-on-Rails
I really love the idea behind that project and after some Perl people complained they now also accept submissions from Perl developers. So if you live in or around Germany and would like to do some propaganda for your favorite web framework, apply right now, as long as your favorite web framework is based on Perl. :-)

Florian Ragwitz: Seeking on gzip streams?

Dear Lazyweb, do you know anything that implements seeking on gzip/zlib streams? Seeking forward is easily done by reading the data up to the requested position, which isn't nice, but works well. Seeking backwards may be implemented by something like this:
inflateEnd (&stream);
fseek (input, 0, SEEK_SET);
inflateInit (&stream);
/* read up to the requested position */
This will be terribly slow in some cases, but oh well. I'm just wondering how seeking with SEEK_END could be implemented without inflating the whole stream first.
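The same rewind-and-reinflate idiom can be sketched with Python's gzip module (gz_seek is a hypothetical helper written for illustration, not an existing API): forward seeks decompress and discard, backward seeks restart from the beginning, mirroring the inflateEnd/fseek/inflateInit snippet above.

```python
import gzip
import io

def gz_seek(gz, target):
    """Emulate seeking on a gzip stream: rewind and re-inflate for
    backward seeks, decompress and discard for forward seeks."""
    if target < gz.tell():
        gz.seek(0)  # GzipFile restarts decompression from the start
    remaining = target - gz.tell()
    while remaining > 0:
        chunk = gz.read(min(65536, remaining))
        if not chunk:  # hit EOF before reaching the target
            break
        remaining -= len(chunk)

data = b"0123456789" * 2000
gz = gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(data)))
gz_seek(gz, 12345)                    # forward: read and throw away
assert gz.read(5) == data[12345:12350]
gz_seek(gz, 7)                        # backward: rewind and re-inflate
assert gz.read(5) == data[7:12]
```

gzip.GzipFile.seek() actually works this way internally. SEEK_END stays expensive without extra bookkeeping: the gzip trailer's ISIZE field stores the uncompressed length only modulo 2^32, so in general the whole stream has to be inflated to find the end.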

30 September 2006

Florian Ragwitz: xmms2_0.2DrGonzo-1

The xmms2 project released 0.2DrGonzo recently. I've prepared packages for it, but as the new release adds two new plugins, I needed to add two new binary packages. Therefore it'll need to go through the NEW queue again. In the meantime you can grab the source package and build it yourself for your architecture:

Florian Ragwitz: Perldition, a small Blog and CMS, written in Perl

Until a few days ago my website was driven by PodCMS, which allowed me to manage all of the content as directories and files containing Pod (Plain Old Documentation). Unfortunately that wasn't quite flexible enough and didn't allow some features, like comments, tags and trackbacks, to be implemented easily. Also, Pod sucks for some sorts of content, as there's no satisfying Pod2Html module on CPAN, it seems. Therefore I decided to create something new. The new system has all the features the old one had, but now allows content to be created in lots of formats. Other markup formats are possible as well, as the API for the formatting plugins is quite simple and usually just a thin wrapper around a CPAN module which does the actual translation to HTML. Besides allowing new formats to write the content in, it also adds several new features. In conclusion, I'm pretty happy with the new software. I'm just very disappointed by the quality of the HTML that the various Pod2HTML modules on CPAN produce, so I'll probably end up writing something myself, based on Pod::Parser. PS: The URL of the RSS feed changed. Please use http://perldition.org/blog.rss.

26 September 2006

Florian Ragwitz: My most often executed shell commands

$ history 1 | awk '{ print $2 }' | awk 'BEGIN { FS="|" } { print $1 }' | sort | uniq -c | sort -r | head -10
On my laptop:
   1557 vi
    710 perl
    692 man
    661 cd
    648 ac
    513 sudo
    510 grep
    499 rm
    388 ls
    180 wajig
On weedy.perldition.org, my server:
   1967 sudo
   1052 vi
    810 cd
    766 l
    352 screen
    316 perl
    250 ..
    213 wajig
    212 find
    194 rm
.. expands to cd .., l is shorthand for ls and ac is my alias for apt-cache.

4 September 2006

Florian Ragwitz: Linux::Sysfs and ExtUtils::Autoconf

I recently wrote two new Perl modules. The first is called Linux::Sysfs and is a library binding for libsysfs. It offers a convenient interface to sysfs (/sys). After releasing the first version of it for peer review and, later on, to CPAN, I received a lot of reports about test failures because of a missing or outdated libsysfs. Therefore I wished for something like GNU autoconf for Perl modules. It would be possible to write something similar in pure Perl (someone actually tried: Config::Autoconf), but compiling C programs from Perl in a portable way isn't much fun. Therefore I decided to write some Perl glue around autoconf and autoheader so those tools are easy to use from Makefile.PLs and such. I did so and the result is ExtUtils::Autoconf. It'll be on CPAN within the next few hours. Until then the current version is available from here. It currently has some documentation about using it from an ExtUtils::MakeMaker-based Makefile.PL. Instructions on how to use it with Module::Build and Module::Install are still missing. The Module::Install part will be done by an extension called Module::Install::Autoconf, which I'm currently about to write. For the Module::Build part I don't really have an idea of how to do that, so any help from some Module::Build people would be much appreciated.

11 June 2006

Florian Ragwitz: Class::DBI is dead

David, you wrote about Class::DBI performance. Even without benchmarking you could have found out that Class::DBI is quite slow. Fortunately there are some alternative object-relational mappers available in the perl universe. The best ones I've found so far are Rose::DB::Object and DBIx::Class. I took a closer look at both and would like to share my experience. As this shows, RDBO is faster than DBIC in most cases. The generated SQL doesn't differ too much, and therefore it must be the perl side of things that makes the difference. Matt S Trout <dbix-class@trout.me.uk> says:
  However, RDBO achieves its perl speed by aggressive inlining of stuff
  etc. - for example the main object retrieval function in RDBO's manager
  class is >3000 lines in a single sub. DBIC values extensibility over a
  few extra sub calls, so methods are much more broken out and there are
  many more ways to hook into the DBIC execution process to extend.
Also its idea of resultsets is something I really love. Here's a small example to illustrate that:
  my $user_rs = $schema->resultset('User')->search({ registered => 1 });
  $user_rs    = $user_rs->search(
      { 'comment.title' => 'Foo' },
      {
          join     => { article => 'comment' },
          order_by => 'user.name',
      },
  );
  # no sql executed yet.
  # now you can use your resultset as an iterator or query a list of User
  # objects from it or ..
  while (my $user = $user_rs->next) {
      ...
  }
  # or
  my $count = $user_rs->count;
Using these resultsets makes it extremely easy to build up queries piece by piece and to integrate well with, for example, a templating system. You don't need to fetch all row objects and give them to your template; you can just pass the iterator to the template library. There's a lot more to say about these two object-relational mappers (for example, RDBO supports prefetching of multiple one-to-many relationships at once, which DBIC doesn't), but maybe you should just take a look yourself. I personally prefer DBIx::Class for its vast extensibility.

10 April 2006

Florian Ragwitz: What package is eating up my disk space?

Enrico, how about dpigs(1) from the debian-goodies package to solve your disk usage problem? It's written in #!/bin/sh and not that nifty, but if you omit the option parsing and usage code, it's even shorter. It basically does
  grep-status -nsInstalled-size,Package -F Status ' installed' $STATUS \
    | perl -p00l12 -e 's/\n/ /' \
    | sort -rn \
    | head --lines=$LINES

Enrico Zini: What package is eating up my disk space?

My 5Gb /usr partition is full. What do I have installed that's eating up all the space? Let's see:
#!/usr/bin/ruby
# pkgsizestat - Display the installed size of packages in a filesystem
#
# Copyright (C) 2006  Enrico Zini <enrico@debian.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
# Display the installed size of packages in the given filesystem
# Defaults to /usr if none specified
#
# Usually used as "./pkgsizestat /usr | sort -nr | less" to see what packages
# are filling up your /usr partition
dev = File.stat(ARGV[0] || "/usr").dev
def pkgsize(name, dev)
      size = 0
      IO.foreach(name) { |line|
              begin
                      st = File.stat(line.chomp)
                      if (st.file? && st.dev == dev)
                              size += st.size
                      end
              rescue
              end
      }
      return size
end
Dir.glob("/var/lib/dpkg/info/*.list").each { |file|
      puts "%d %s" % [pkgsize(file, dev), file.gsub(/.+?\/([^\/]+)\.list/, '\1')]
}
Neat little useful ruby script. Ruby is nice in making scripts short, clean and compact. Now I need a shorter version of the GPL :) Update: Florian Ragwitz suggests using dpigs(1) from debian-goodies instead. What my script does that dpigs doesn't, however, is count only those files provided by the packages that reside in the given partition. I could for example use my script to see what's filling up the root ('/') partition when /usr is mounted elsewhere, and find out that the top package is not openclipart-svg, but linux-image-2.6.15-1-686. Update: htom sent an updated version that sums all sizes and shows packages only up to a certain total size:
dev = File.stat(ARGV[0] || "/usr").dev
def pkgsize(name, dev)
      size = 0
      IO.foreach(name) { |line|
              begin
                      st = File.stat(line.chomp)
                      if (st.file? && st.dev == dev)
                              size += st.size
                      end
              rescue
              end
      }
      return size
end
pkgs = {}
Dir.glob("/var/lib/dpkg/info/*.list").each { |file|
  pkgs[pkgsize(file, dev)] = file.gsub(/.+?\/([^\/]+)\.list/, '\1')
}
pkgs = pkgs.sort
pkgs.reverse!
to_size = 1024**3 # show up to 1 GB
size = 0
pkgs.each do |a|
  size += a[0]
  puts "%d %d %s" % [a[0], size, a[1]]
  break if size >= to_size
end
Ralph Amissah posted a different variant:
# [License part omitted]
# [License part omitted]
dev=File.stat(ARGV[0] || "/usr").dev
def pkgsize(name, dev)
  size=0
  IO.foreach(name) do |line|
    begin
      st=File.stat(line.chomp)
      if (st.file? && st.dev == dev)
        size += st.size
      end
    rescue
    end
  end
  return size
end
def space(file,dev)
  "%d %s" % [pkgsize(file,dev),file.gsub(/.+?\/([^\/]+)\.list/,'\1')]
end
@used=Array.new
Dir.glob("/var/lib/dpkg/info/*.list").sort.each do |file|
  x=Array.new
  x << space(file,dev).split(/\s+/)
  p [x[0][0].to_i,x[0][1]]
  @used << [x[0][0].to_i,x[0][1]]
end
#p @used.sort.each { |x| p x }
@used.sort.each { |x| puts "#{x[0]} #{x[1]}" }
#redirect to file?
Thank you everyone for the nice feedback!

1 April 2006

Florian Ragwitz: Audio::XMMSClient

Yesterday I did some work on the XMMS2 Debian packages. Thereby I noticed that libxmmsclient0, the xmms2 client library that allows you to write your own xmms2 clients, is bound to quite a lot of languages, but perl bindings were still missing. I took a look at the library interface and it seemed that creating bindings shouldn't be a big deal. Therefore I gave it a try and wrote Audio::XMMSClient. There are still some rough edges, but it works pretty well already. The interface is quite close to the C API, so it's possible to work with it even though documentation, examples and a test suite are still missing. Comments on the API, the namespace and whether this module should go into the xmms2 distribution, like the other language bindings, or directly to CPAN are most welcome.
