I know I don't mention a season, and I'm a few hours late for hallowe'en, but here's a haiku about Haiku:
A death, once again,
The master sighs, and fixes,
It rises up, undead.
Well, 123-reg, mostly I think you don't know how to do email.
Date: Wed, 01 Jul 2015 06:13:16 -0000
From: 123-reg <firstname.lastname@example.org>
To: email@example.com
Subject: Tell us what you think for your chance to win
X-Mailer: MIME::Lite 3.027 (F2.74; T1.28; A2.04; B3.13; Q3.13)

Tell us what you think of 123-reg! <!-- .style1 color: #1996d8 -->
curl | sudo bash -. We, the OS-development literati, have come out in droves to say "eww, nasty, don't do that please" and yet we have brought this upon ourselves. Our tendency to invent, and reinvent, at the very basic levels of distributions has resulted in so many operating systems and so many ways to package software (if not in underlying package format then in policy and process) that third-party application authors simply cannot keep up. Couple that with the desire of consumers not to have their chosen platform discounted, and if you provide Debian packages, you end up needing to provide for Fedora, RHEL, SuSE, SLES, CentOS, Mint, Gentoo, Arch, etc., etc.; let alone supporting all the various BSDs. This leads to the simple expedient of
curl | sudo bash -. Nobody, not even those who are most vehemently against this mechanism of installing software, can claim that it is not quick, simple for users, easy to copy/paste out of a web page, and leaves all the icky complexity of sorting things out up to a script which the computer can run, rather than the nascent user of the software in question. As a result, many varieties of software have ended up using this as a simple installation mechanism, from games to orchestration frameworks - everyone can acknowledge how easy it is to use. Now, some providers are wising up a little and ensuring that the URL you are
curling is at least an
https:// one. Some even omit the
sudo from the copy/paste space and have it in the script, allowing them to display some basic information and prompt the user that this will occur as root before going ahead and elevating. All of these myriad little tweaks to the fundamental idea improve matters, but are ultimately just putting lipstick on a fairly sad-looking pig. So, what can be done? Well, we (again the OS-development literati) got ourselves into this horrendous mess, so it's up to us to get ourselves back out. We're all too entrenched in our chosen packaging methodologies, processes, and policies to back out of those; yet we're clearly not properly servicing a non-trivial segment of our userbase. We need to do better. Not everyone who currently honours a
curl | sudo bash - is capable of understanding why it's such a bad idea to do so. Some education may reduce that number, but it will never eliminate it. For a long time I advocated a switch to a
wget && review && sudo ./script approach instead, but the above comment, about people who don't understand why it might be a bad idea, really shows how few of those users would even be capable of starting to review a script they downloaded, let alone able to usefully judge for themselves whether it is really safe to run. Instead we need something better, something collaborative, something capable of solving the accessibility issues which led to the
curl | sudo bash - revolt in the first place.
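To make that concrete, here is a minimal sketch of the gentler patterns mentioned above; the URL and file names are made up purely for illustration:

    # Fetch over https, actually read the thing, then run it.
    wget https://example.com/install.sh
    less install.sh
    sudo ./install.sh

And the script itself can do the elevation explicitly, telling the user what is about to happen rather than hiding the sudo in the copy/paste line:

    #!/bin/sh
    # Sketch of an installer that announces and confirms before elevating.
    if [ "$(id -u)" -ne 0 ]; then
        echo "This installer needs root to put files under /usr/local."
        printf 'Re-run under sudo? [y/N] '
        read -r answer
        case "$answer" in
            [Yy]*) exec sudo "$0" "$@" ;;
            *) echo "Aborted."; exit 1 ;;
        esac
    fi
    # ... the actual installation steps run here, as root ...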
Whatever we come up with needs to be as easy to use as curl | sudo bash -, or easier. This might mean a particular URI format which can have OS-specific ways to handle standardised inputs, or it might mean a pervasive tool which does something like that.
To be better than curl | sudo bash -, as many platforms as possible need to get involved. This means not only Debian, Ubuntu, Fedora and SuSE; but also Arch, FreeBSD, NetBSD, CentOS, etc. Maybe even the OpenSolaris/Illumos people need to get involved.
Until such a thing exists, we will never be rid of curl | sudo bash -; it's just like that one odd person at the party: no one knows who invited them, no one wants to tell them to leave because they do fill a needed role, but no one really seems to like them. Until then, let's suck it up and, while we might not like it, let's just let people keep on
curl | sudo bash -ing until someone gets hurt.
curl | sudo bash -, for the record.
open(2) system call with the O_EXCL flag to create the target file. This should allow things to work with both NFS and VFAT.
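As an aside, the same create-only guarantee is what the classic shell locking idiom relies on: with noclobber set, the redirection below fails if the file already exists (shells typically implement this with an O_CREAT|O_EXCL open). The lock file path is just an example:

    #!/bin/sh
    # Atomically create a lock file, or fail if it already exists.
    lockfile=/tmp/myprog.lock
    if ( set -C; : > "$lockfile" ) 2>/dev/null; then
        trap 'rm -f "$lockfile"' EXIT
        echo "lock acquired, doing the work"
        # ... real work goes here ...
    else
        echo "another instance already holds $lockfile" >&2
        exit 1
    fi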
--log=syslog message format improvement by Daniel Silverstone. No longer includes a timestamp, since syslog adds it anyway. Also, the process name is now set on Linux.
help, unless one already exists.
"I'm really sorry to inform you that my supervisor refused the share transfer request to the other account. He told me that we do not transfer products either in those cases. I deeply apologize since I thought it was possible. I thank you for your confidence in Gandi, and hope that this occurrence does not deter you from continuing with us."

Now I must stress that Gandi are completely within their contractual rights to do this and are not obligated to provide any assistance to help make my blunder less costly to me, but they have firmly shown they have no willingness to be flexible in any way, which means I know where I will not be spending my money in future.
rsync, of course, was the obvious choice, but there were others. I ended up doing what many geeks do: I wrote my own wrapper around
rsync. There are hundreds, possibly thousands, of such wrappers around the Internet. I also got the idea that doing a startup to provide online backup space would be a really cool thing. However, I didn't really do anything about that until 2007. More on that later. The
rsync wrapper script I wrote used hardlinked directory trees to provide a backup history, though not in the smart way that backuppc does it. The hardlinks were wonderful, because they were cheap, and provided de-duplication. They were also quite cumbersome when I needed to move my backups to a new disk the first time. It turned out that a lot of tools deal very badly with directory trees with large numbers of hardlinks.
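For anyone who has not seen the trick, the core of a hardlinked snapshot is a single rsync invocation. The paths below are invented for illustration, and this is the generic version of the idea rather than my script verbatim:

    # Each new snapshot hard-links unchanged files against the previous one,
    # so every snapshot looks like a full copy but only changed files
    # consume new space.
    rsync -a --link-dest=/backups/snapshot.0 /home/ /backups/snapshot.1/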
I also decided I wanted encrypted backups. This led me to find duplicity, which is a nice program that does encrypted backups, but I had issues with some of its limitations. To fix those limitations, I would have had to re-design and possibly re-implement the entire program. The biggest limitation was that it treated backups as a full backup, plus a sequence of incremental backups, which were deltas against the previous backup. Delta-based incrementals make sense for tape drives. You run a full backup once, then incremental deltas for every day. When enough time has passed since the full backup, you do a new full backup, and then future incrementals are based on that. Repeat forever. I decided that this makes no sense for disk-based backups. If I have already backed up a file, there's no point in making me back it up again, since it's already there on the same hard disk. It makes even less sense for online backups, since doing a new full backup would require me to transmit all the data all over again, even though it's already on the server.

The first battle

I could not find a program that did what I wanted to do, and like every good NIHolic, I started writing my own. After various aborted attempts, I started for real in 2006. Here is the first commit message:

revno: 1
committer: Lars Wirzenius <firstname.lastname@example.org>
branch nick: wibbr
timestamp: Wed 2006-09-06 18:35:52 +0300
message: Initial commit.
wibbr was the placeholder name for Obnam until we came up with something better. "We" was myself and Richard Braakman, who was going to be doing the backup startup with me. We eventually founded the company near the end of 2006, and started doing business in 2007. However, we did not do very much business, and ran out of money in September 2007. We ended the backup startup experiment. That's when I took a job with Canonical, and Obnam became a hobby project of mine: I still wanted a good backup tool. In September 2007, Obnam was working, but it was not very good. For example, it was quite slow and wasteful of backup space. That version of Obnam used deltas, based on the
rsync algorithm, to back up only changes. It did not require the user to do full and incremental backups manually, but essentially created an endless sequence of incrementals. It was possible to remove any generation, and Obnam would manage the deltas as necessary, keeping the ones needed for the remaining generations, and removing the rest. Obnam made it look as if each generation was independent of the others. The wasteful part was the way in which metadata about files was stored: each generation stored the full list of filenames and their permissions and other inode fields. This turned out to be bigger than my daily delta.

The lost years; getting lost in the forest

For the next two years, I did a little work on Obnam, but I did not make progress very fast. I changed the way metadata was stored, for example, but I picked another bad way of doing it: the new way was essentially building a tree of directory and file nodes, and any unchanged subtrees were shared between generations. This reduced the space overhead per generation, but made it quite slow to look up the metadata for any one file.

The final battle; finding cows in the forest

In 2009 I decided to leave Canonical, and after that my Obnam hobby picked up speed again. Below is a table of the number of commits per year, from the very first commit (
bzr log -n0 | awk '/timestamp:/ { print $3 }' | sed 's/-.*//' | uniq -c | awk '{ print $2, $1 }' | tac):

2006 466
2007 353
2008 402
2009 467
2010 616
2011 790
2012 282

During most of 2010 and 2011 I was unemployed, and happily hacking Obnam, while moving to another country twice. I don't recommend that as a way to hack on hobby projects, but it worked for me. After Canonical, I decided to tackle the way Obnam stores data from a new angle. Richard told me about the copy-on-write (or COW) B-trees that btrfs uses, originally designed by Ohad Rodeh (see his paper for details), and I started reading about that. It turned out that they're pretty ideal for backups: each B-tree stores data about one generation. To start a new generation, you clone the previous generation's B-tree, and make any modifications you need. I implemented the B-tree library myself, in Python. I wanted something that was flexible about how and where I stored data, which the btrfs implementation did not seem to give me. (Also, I worship at the altar of NIH.) With the B-trees, doing file deltas from the previous generation no longer made any sense. I realized that it was, in any case, a better idea to store file data in chunks, and re-use chunks in different generations as needed. This makes it much easier to manage changes to files: with deltas, you need to keep a long chain of deltas and apply many deltas to reconstruct a particular version. With lists of chunks, you just get the chunks you need. (There is a small sketch of this chunk idea below, after the list.)

The spin-off franchise; lost in a maze of dependencies, all alike

In the process of developing Obnam, I have split off a number of helper programs and libraries:
md5sum on steroids), useful for verifying that files are restored correctly
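To illustrate the chunk idea mentioned above (this is not Obnam's actual on-disk format; the paths, chunk size, and file names here are invented), a content-addressed chunk store only takes a few lines of shell:

    #!/bin/sh
    # Store a file as 1 MiB chunks named after their SHA-256 checksum.
    # Identical chunks are stored only once, so unchanged data is shared
    # between backup generations; a generation is just a manifest of hashes.
    store=./chunk-store
    file="$1"
    mkdir -p "$store/chunks"
    manifest="$store/manifest.$(date +%s)"
    tmp=$(mktemp -d)
    split -b 1M "$file" "$tmp/piece."
    for piece in "$tmp"/piece.*; do
        sum=$(sha256sum "$piece" | cut -d' ' -f1)
        [ -e "$store/chunks/$sum" ] || cp "$piece" "$store/chunks/$sum"
        echo "$sum" >> "$manifest"
    done
    rm -r "$tmp"
    # Restoring is concatenating the listed chunks in order:
    #   while read -r sum; do cat "$store/chunks/$sum"; done < "$manifest"

The real thing needs metadata, encryption, and a way to get rid of chunks no longer referenced by any generation, but the sharing idea is the same.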
unstable, and will hopefully get into
wheezy before the Debian freeze. I provide packages built for
squeeze on my own repository, see the download page. The changes in the 1.0 release compared to the previous one:
.desktop file before the executable or its data files are unpacked, and a user may notice the program in their menu and start it, resulting in an error.

Upgrades suffer from additional problems. Software that gets upgraded may be running during the upgrade. Should the package manager replace the software's data files with new versions, which may be in a format that the old program does not understand? Or install new plugins that will cause the old version of the program to segfault? If the package manager does that, users may experience turbulence without having put on their seat belts. If it doesn't do that, it can't install the package, or it needs to wait, perhaps for a very long time, for a safe time to do the upgrade. These problems have usually been either ignored, or solved by using package-specific hacks. For example, plugins might be stored in a directory that embeds the program's version number, ensuring that the old version won't see the new plugins. Some people would like to apply installs and upgrades only at shutdown or bootup, but that has other problems.

None of the hacks solve the downgrade problem. The package managers can replace a package with an older version, and often this works well. However, in many cases, any package maintainer scripts won't be able to deal with downgrades. For example, they might convert data files to a new format or name or location upon upgrades, but won't try to undo that if the package gets downgraded. Given the combinatorial explosion of package versions, it's perhaps just as well that they don't try.

For Baserock, we absolutely need to have downgrades. We need to be able to go back to a previous version of the system if an upgrade fails. Traditionally, this has been done by providing a "factory reset", where the current version of the system gets replaced with whatever version was installed in the factory. We want that, but we also want to be able to choose other versions, not just the factory one. If a device is running version X, and upgrades to X+1, but that version turns out to be a dud, we want to be able to go back to X, rather than all the way back to the factory version. The approach we'll be taking with Baserock relies on btrfs subvolumes and snapshots. Each version of the system will be installed in a separate subvolume, which gets cloned from the previous version, using copy-on-write to conserve space. We'll make the bootloader able to choose a version of the system to boot, and (waving hands here) add some logic to be able to automatically revert to the previous version when necessary. We expect this to work better and more reliably than the current package-based one.
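A sketch of what that looks like with the btrfs tools; the subvolume names are invented for illustration, and the real Baserock machinery will be rather more involved:

    # Clone the running system into a new subvolume; copy-on-write means
    # the clone is nearly free until the two versions start to diverge.
    btrfs subvolume snapshot /systems/version-5 /systems/version-6
    # Apply the upgrade inside /systems/version-6, leaving version-5 untouched,
    # then boot the new version, e.g. with rootflags=subvol=/systems/version-6
    # on the kernel command line or a boot menu entry per version.
    # Rolling back is booting an older subvolume again; a factory reset is
    # booting the original factory subvolume.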
Making choices

Debian is pretty bad at making choices. Almost always, when faced with a need to choose between alternative solutions for the same problem, we choose all of them. For example, we support pretty much every init implementation, various implementations of
/bin/sh, and we even have at least three entirely different kernels. Sometimes this non-choice is a good thing. Our users may need features that only one of the kernels supports, for example. And we certainly need to be able to provide both mysql and postgresql, since various software we want to provide to our users needs one and won't work with the other. At other times, the inability to choose causes trouble. Do we really need to support more than one implementation of
/bin/sh? By supporting both dash and bash for that, we double the load on testing and QA, and introduce yet another variable to deal with in any debugging situation involving shell scripts. Especially for core components of the system, it makes sense to limit the flexibility of users to pick and choose. Combinatorial explosion, déjà vu: every binary choice doubles the number of possible combinations that need to be tested and supported and checked during debugging. Flexibility begets complexity, complexity begets problems.

This is less of a problem at upper levels of the software stack. At the very top level, it doesn't really matter if there are many choices. If a user can freely choose between vi and Emacs, this does not add complexity at the system level, since nothing else is affected by that choice. However, if we were to add a choice between glibc, eglibc, and uClibc for the system C library, then everything else in the system needs to be tested three times rather than once.

Reducing the friction coefficient for system development

Currently, a Debian developer takes upstream code, adds packaging, perhaps adds some patches (using one of several methods), builds a binary package, tests it, uploads it, and waits for the build daemons and the package archive and user-testers to report any problems. That's quite a number of steps to go through for the simple act of adding a new program to Debian, or updating it to a new version. Some of it can be automated, but there are still hoops to jump through. Friction does not prevent you from getting stuff done, but the more friction there is, the more energy you have to spend to get it done. Friction slows down the crucial hack-build-test cycle of software development, and that hurts productivity a lot. Every time a developer has to jump through any hoops, or wait for anything, he slows down.

It is, of course, not just a matter of the number of steps. Debian requires a source package to be uploaded with the binary package. Many, if not most, packages in Debian are maintained using version control systems. Having to generate a source package and then wait for it to be uploaded is unnecessary work. The build daemon could get the source from version control directly. With signed commits, this is as safe as uploading a tarball.

The above examples are specific to maintaining a single package. The friction that really hurts Debian is the friction of making large-scale changes, or changes that affect many packages. I've already mentioned the difficulty of making large transitions above. Another case is making policy changes, and then implementing them. An excellent example of that in Debian is the policy change to use
/usr/share/doc for documentation, instead of
/usr/doc. This took us many years to do. We are, I think, perhaps a little better at such things now, but even so, it is something that should not take more than a few days to implement, rather than half a decade.

On the future of distributions

Occasionally, people say things like "distributions are not needed", or that "distributions are an unnecessary buffer between upstream developers and users". Some even claim that there should only be one distribution. I disagree.

A common view of a Linux distribution is that it takes some source provided by upstream, compiles that, adds an installer, and gives all of that to the users. This view is too simplistic. The important part of developing a distribution is choosing the upstream projects and their versions wisely, and then integrating them into a whole system that works well. The integration part is particularly important. Many upstreams are not even aware of each other, nor should they need to be, even if their software may need to interact with each other. For example, not every developer of HTTP servers should need to be aware of every web application, or vice versa. (If they had to be, it'd be a combinatorial explosion that'd ruin everything, again.) Instead, someone needs to set a policy of how web apps and web servers interface, what their common interface is, and what files should be put where, for web apps to work out of the box, with minimal fuss for the users. That's part of the integration work that goes into a Linux distribution. For Debian, such decisions are recorded in the Policy Manual and its various sub-policies.

Further, distributions provide quality assurance, particularly at the system level. It's not realistic to expect most upstream projects to do that. It's a whole different skillset and approach that is needed to develop a system, rather than just a single component. Distributions also provide user support, security support, longer-term support than many upstreams, and port software to a much wider range of architectures and platforms than most upstreams actively care about, have access to, or even know about. In some cases, these are things that can and should be done in collaboration with upstreams; if nothing else, portability fixes should be given back to upstreams.

So I do think distributions have a bright future, but the way they're working will need to change.