Search Results: "Richard Braakman"

24 March 2013

Lars Wirzenius: Two new t-shirt designs

I've made two new designs for Trunk Tees, my Cafepress store. Thank you to Richard Braakman for suggesting the .* one. Here are all the older designs as well:

8 June 2012

Lars Wirzenius: Obnam 1.0 (backup software); a story in many words

tl;dr: Version 1.0 of Obnam, my snapshotting, de-duplicating, encrypting backup program, is released. See the end of this announcement for the details.

Where we see the hero in his formative years; parental influence

From the very beginning, my computing life has involved backups. In 1984, when I was 14, my father was an independent telecommunications consultant, which meant he needed a personal computer for writing reports. He bought a Luxor ABC-802, a Swedish computer with a Z80 microprocessor and two floppy drives. My father also taught me how to use it. When I needed to save files, he gave me not one, but two floppies, and explained that I should store my files on one, and then copy them to the other every now and then. Later on, over the years, I've made backups from a hard disk (30 megabytes!) to a stack of floppies, to a tape drive installed into a floppy interface (400 megabytes!), to a DAT drive, and various other media. It was always a bit tedious.

The start of the quest; lengthy justification for NIH

In 2004, I decided to do a full backup by burning a copy of all my files onto CD-R disks. It took me most of the day. Afterwards, I sat admiring the large stack of disks, and realized that I would not ever do that again. I'm too lazy for that. That I had done it once was an aberration in the space-time continuum. Switching to DVD-Rs instead of CD-Rs would reduce the number of disks to burn, but not enough: it would still take a stack of them. I needed something much better.

I had a little experience with tape drives, and that was enough to convince me that I didn't want them. Tape drives are expensive hardware, and the tapes also cost money. If the drive goes bad, you have to get a compatible one, or all your backups are toast. The price per gigabyte was coming down fast for hard drives, and it was clear that they were about to be very competitive with tapes in price.

I looked for backup programs that I could use for disk-based backups. rsync, of course, was the obvious choice, but there were others. I ended up doing what many geeks do: I wrote my own wrapper around rsync. There are hundreds, possibly thousands, of such wrappers around the Internet. I also got the idea that doing a startup to provide online backup space would be a really cool thing. However, I didn't really do anything about that until 2007. More on that later.

The rsync wrapper script I wrote used hardlinked directory trees to provide a backup history (sketched below), though not in the smart way that backuppc does it. The hardlinks were wonderful, because they were cheap, and provided de-duplication. They were also quite cumbersome when I needed to move my backups to a new disk the first time. It turned out that a lot of tools deal very badly with directory trees with large numbers of hardlinks.

I also decided I wanted encrypted backups. This led me to find duplicity, which is a nice program that does encrypted backups, but I had issues with some of its limitations. To fix those limitations, I would have had to re-design and possibly re-implement the entire program. The biggest limitation was that it treated backups as a full backup, plus a sequence of incremental backups, which were deltas against the previous backup. Delta-based incrementals make sense for tape drives: you run a full backup once, then incremental deltas every day. When enough time has passed since the full backup, you do a new full backup, and then future incrementals are based on that. Repeat forever.
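To make the hardlink trick concrete, here is a minimal sketch of that kind of wrapper, in the spirit of my old script but not the actual code; the paths are invented and error handling is omitted:

# A sketch of a hardlink-snapshot wrapper around rsync; paths and
# naming are made up for illustration.
import os
import subprocess
import time

def snapshot(source, backup_root):
    # Each run creates a new timestamped generation directory.
    dest = os.path.join(backup_root, time.strftime("%Y-%m-%dT%H:%M:%S"))
    latest = os.path.join(backup_root, "latest")
    cmd = ["rsync", "-a", "--delete"]
    if os.path.exists(latest):
        # --link-dest makes rsync hardlink any file that is unchanged
        # since the previous generation, instead of copying it again.
        cmd.append("--link-dest=" + os.path.realpath(latest))
    cmd += [source + "/", dest + "/"]
    subprocess.check_call(cmd)
    # Repoint "latest" at the generation we just made.
    tmp = latest + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(dest, tmp)
    os.rename(tmp, latest)

# e.g. snapshot("/home/liw", "/mnt/backup")

Each run costs only the space of the changed files, which is why the hardlinks felt so cheap.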
I decided that the full-plus-incrementals scheme makes no sense for disk-based backups. If I have already backed up a file, there's no point in making me back it up again, since it's already there on the same hard disk. It makes even less sense for online backups, since doing a new full backup would require me to transmit all the data all over again, even though it's already on the server.

The first battle

I could not find a program that did what I wanted to do, and like every good NIHolic, I started writing my own. After various aborted attempts, I started for real in 2006. Here is the first commit message:
revno: 1
committer: Lars Wirzenius <liw@iki.fi>
branch nick: wibbr
timestamp: Wed 2006-09-06 18:35:52 +0300
message:
  Initial commit.
wibbr was the placeholder name for Obnam until we came up with something better. "We" was myself and Richard Braakman, who was going to be doing the backup startup with me. We eventually founded the company near the end of 2006, and started doing business in 2007. However, we did not do very much business, and ran out of money in September 2007. We ended the backup startup experiment. That's when I took a job with Canonical, and Obnam became a hobby project of mine: I still wanted a good backup tool.

In September 2007, Obnam was working, but it was not very good. For example, it was quite slow and wasteful of backup space. That version of Obnam used deltas, based on the rsync algorithm, to back up only changes. It did not require the user to do full and incremental backups manually, but essentially created an endless sequence of incrementals. It was possible to remove any generation, and Obnam would manage the deltas as necessary, keeping the ones needed for the remaining generations, and removing the rest. Obnam made it look as if each generation was independent of the others. The wasteful part was the way in which metadata about files was stored: each generation stored the full list of filenames and their permissions and other inode fields. This turned out to be bigger than my daily delta.

The lost years; getting lost in the forest

For the next two years, I did a little work on Obnam, but I did not make progress very fast. I changed the way metadata was stored, for example, but I picked another bad way of doing it: the new way was essentially building a tree of directory and file nodes, and any unchanged subtrees were shared between generations (a toy sketch of this sharing scheme follows the commit table below). This reduced the space overhead per generation, but made it quite slow to look up the metadata for any one file.

The final battle; finding cows in the forest

In 2009 I decided to leave Canonical, and after that my Obnam hobby picked up speed again. Below is a table of the number of commits per year, from the very first commit (bzr log -n0 | awk '/timestamp:/ {print $3}' | sed 's/-.*//' | uniq -c | awk '{print $2, $1}' | tac):
2006 466
2007 353
2008 402
2009 467
2010 616
2011 790
2012 282
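As promised above, here is a toy model of the "shared subtrees" metadata scheme from the lost years; it is nothing like Obnam's actual code, just the sharing idea:

# A toy model of generation metadata as a tree of nodes, where making
# a new generation copies only the path to whatever changed and shares
# every untouched subtree with older generations.

class Node(object):
    def __init__(self, meta, children=None):
        self.meta = meta                # permissions, inode fields, ...
        self.children = children or {}  # name -> Node

def update(root, path, meta):
    # Return a new root with meta set at path. Only the nodes on the
    # path are copied; everything off the path is shared.
    if not path:
        return Node(meta, root.children if root else {})
    children = dict(root.children) if root else {}
    children[path[0]] = update(children.get(path[0]), path[1:], meta)
    return Node(root.meta if root else None, children)

gen1 = update(None, ["home", "liw", "notes.txt"], {"size": 10})
gen1 = update(gen1, ["etc", "fstab"], {"size": 100})
gen2 = update(gen1, ["home", "liw", "notes.txt"], {"size": 12})
# The unchanged /etc subtree is literally the same object in both
# generations; only the path to the changed file was copied.
assert gen2.children["etc"] is gen1.children["etc"]

The assert at the end is the point: an unchanged subtree is stored once and referenced by every generation that contains it.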
During most of 2010 and 2011 I was unemployed, and happily hacking Obnam, while moving to another country twice. I don't recommend that as a way to hack on hobby projects, but it worked for me.

After Canonical, I decided to tackle the way Obnam stores data from a new angle. Richard told me about the copy-on-write (or COW) B-trees that btrfs uses, originally designed by Ohad Rodeh (see his paper for details), and I started reading about that. It turned out that they're pretty ideal for backups: each B-tree stores data about one generation. To start a new generation, you clone the previous generation's B-tree, and make any modifications you need. I implemented the B-tree library myself, in Python. I wanted something that was flexible about how and where I stored data, which the btrfs implementation did not seem to give me. (Also, I worship at the altar of NIH.)

With the B-trees, doing file deltas from the previous generation no longer made any sense. I realized that it was, in any case, a better idea to store file data in chunks, and re-use chunks in different generations as needed (a toy sketch of the chunk scheme appears near the end of this announcement). This makes it much easier to manage changes to files: with deltas, you need to keep a long chain of deltas and apply many deltas to reconstruct a particular version. With lists of chunks, you just get the chunks you need.

The spin-off franchise; lost in a maze of dependencies, all alike

In the process of developing Obnam, I have split off a number of helper programs and libraries. I have found it convenient to keep these split off, since I've been able to use them in other projects as well. However, it turns out that those installing Obnam don't like this: it would probably make sense to have a fat release with Obnam and all dependencies, but I haven't bothered to do that yet.

The blurb; readers advised about blatant marketing

The strong points of Obnam are, I think: Backups may be stored on local hard disks (e.g., USB drives), on any locally mounted network file shares (NFS, SMB, almost anything with remotely POSIX-like semantics), or on any SFTP server you have access to. What's not so strong is backing up online over SFTP, particularly with long round-trip times to the server, or many small files to back up. That performance is Obnam's weakest part. I hope to fix that in the future, but I don't want to delay 1.0 for it.

The big news; readers sighing in relief

I am now ready to release version 1.0 of Obnam. Finally. It's been a long project, much longer than I expected, and much longer than was really sensible. However, it's ready now. It's not bug free, and it's not as fast as I would like, but it's time to declare it ready for general use. If nothing else, this will get more people to use it, and they'll find the remaining problems faster than I can on my own. I have packaged Obnam for Debian; it is in unstable, and will hopefully get into wheezy before the Debian freeze. I provide packages built for squeeze from my own repository; see the download page. The changes in the 1.0 release compared to the previous one:

The future; not including winning lottery numbers

I expect to get a flurry of bug reports in the near future as new people try Obnam. It will take a bit of effort dealing with that. Help is, of course, welcome! After that, I expect to be mainly working on Obnam performance for the foreseeable future. There may also be a FUSE filesystem interface for restoring from backups, and a continuous backup version of Obnam. Plus other features, too.
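And here is the toy sketch of chunk-based storage promised earlier; it is not Obnam's real on-disk format, and the fixed chunk size and SHA-1 chunk ids are arbitrary choices for illustration:

# Split file data into chunks, store each chunk once under a
# content-derived name, and record only the list of chunk ids per
# file in each generation.
import hashlib
import os

CHUNK_SIZE = 64 * 1024

def backup_file(path, chunk_dir):
    # Returns the list of chunk ids a generation would record.
    ids = []
    with open(path, "rb") as f:
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            chunk_id = hashlib.sha1(data).hexdigest()
            chunk_path = os.path.join(chunk_dir, chunk_id)
            if not os.path.exists(chunk_path):  # re-use existing chunks
                with open(chunk_path, "wb") as out:
                    out.write(data)
            ids.append(chunk_id)
    return ids

def restore_file(ids, chunk_dir, out_path):
    # Reconstruct a file by concatenating its chunks; there is no
    # delta chain to replay, whichever generation the list came from.
    with open(out_path, "wb") as out:
        for chunk_id in ids:
            with open(os.path.join(chunk_dir, chunk_id), "rb") as f:
                out.write(f.read())

Restoring any generation just fetches the chunks its file entries list, which is the whole advantage over delta chains.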
I make no promises about how fast new features and optimizations will happen: Obnam is a hobby project for me, and I work on it only in my free time. Also, I have a bunch of things that are on hold until I get Obnam into shape, and I may decide to do one of those things before the next big Obnam push.

Where; the trail of an errant hacker

I've developed Obnam in a number of physical locations, and I thought it might be interesting to list them: Espoo, Helsinki, Vantaa, Kotka, Raahe, Oulu, Tampere, Cambridge, Boston, Plymouth, London, Los Angeles, Auckland, Wellington, Christchurch, Portland, New York, Edinburgh, Manchester, San Giorgio di Piano. I've also hacked on Obnam in trains, on planes, and once on a ship, but only for a few minutes on the ship before I got seasick.

Thank you; sincerely

13 March 2011

Lars Wirzenius: DPL elections: candidate counts

Out of curiosity, and because it is Sunday morning and I have a cold and can't get my brain to do anything tricky, I counted the number of candidates in each year's DPL elections.
Year Count Names
1999 4 Joseph Carter, Ben Collins, Wichert Akkerman, Richard Braakman
2000 4 Ben Collins, Wichert Akkerman, Joel Klecker, Matthew Vernon
2001 4 Branden Robinson, Anand Kumria, Ben Collins, Bdale Garbee
2002 3 Branden Robinson, Raphaël Hertzog, Bdale Garbee
2003 4 Moshe Zadka, Bdale Garbee, Branden Robinson, Martin Michlmayr
2004 3 Martin Michlmayr, Gergely Nagy, Branden Robinson
2005 6 Matthew Garrett, Andreas Schuldei, Angus Lees, Anthony Towns, Jonathan Walther, Branden Robinson
2006 7 Jeroen van Wolffelaar, Ari Pollak, Steve McIntyre, Anthony Towns, Andreas Schuldei, Jonathan (Ted) Walther, Bill Allombert
2007 8 Wouter Verhelst, Aigars Mahinovs, Gustavo Franco, Sam Hocevar, Steve McIntyre, Raphaël Hertzog, Anthony Towns, Simon Richter
2008 3 Marc Brockschmidt, Raphaël Hertzog, Steve McIntyre
2009 2 Stefano Zacchiroli, Steve McIntyre
2010 4 Stefano Zacchiroli, Wouter Verhelst, Charles Plessy, Margarita Manterola
2011 1 Stefano Zacchiroli (no vote yet)
Winners are indicated by boldface. I expect Zack to win over "None Of The Above", so I went ahead and boldfaced him already, even though there has not yet been a vote this year. The median number of candidates is 4.

1 March 2008

Anthony Towns: Been a while...

So, sometime over the past few weeks I clocked up ten years as a Debian developer:
From: Anthony Towns <aj@humbug.org.au>
Subject: Wannabe maintainer.
Date: Sun, 8 Feb 1998 18:35:28 +1000 (EST)
To: new-maintainer@debian.org
Hello world,
I'd like to become a debian maintainer.
I'd like an account on master, and for it to be subscribed to the
debian-private list.
My preferred login on master would have been aj, but as that's taken
ajt or atowns would be great.
I've run a debian system at home for half a year, and a system at work
for about two months. I've run Linux for two and a half years at home,
two years at work. I've been active in my local linux users' group for
just over a year. I've written a few programs, and am part way through
packaging the distributed.net personal proxy for Debian (pending
approval for non-free distribution from distributed.net).
I've read the Debian Social Contract.
My PGP public key is attached, and also available as
<http://azure.humbug.org.au/~aj/aj_key.asc>.
If there's anything more you need to know, please email me.
Thanks in advance.
Cheers,
aj
-- 
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. PGP encrypted mail preferred.
On Netscape GPLing their browser: ``How can you trust a browser that
ANYONE can hack? For the secure choice, choose Microsoft.''
        -- <oryx@pobox.com> in a comment on slashdot.org
Apparently that also means I’ve clocked up ten and a half years as a Debian user; I think my previous two years of Linux (mid-95 to mid-97) were split between Slackware and Red Hat, though I couldn’t say for sure at this point. There have already been a few other grand ten-year reviews, such as Joey Hess’s twenty-part serial, or LWN’s week-by-week review, or ONLamp’s interview with Bruce Perens, Eric Raymond and Michael Tiemann on ten years of “open source”. I don’t think I’m going to try matching that sort of depth, though, so here are some of my highlights (after the break).
Hrm, this is going on longer than I’d hoped. Oh well, to be continued!