I've recently returned from
ICFP 2010 (
International
conference on functional programming). This was the second time I
went there, and like last year, it was a very interesting
conference. This year, I've even managed to do one keysigning with a
fellow Debian Developer (hello Sylvain!)
While there were many talks, I only mention a few key ones, that I
have found very cool.
Note: I'm a layman in this field. The text below is my own
interpretation of the talks and contains many errors and wrong
facts corrections are welcome!
Workshop on ML
I wasn't present at this workshop, but other people pointed me to this
very cool talk:
Mirage project
The abstract, to be found at
http://www.cs.rit.edu/~mtf/ml2010/program.html#Program (search for
Mirage), says:
Our framework, dubbed Mirage, extends the Objective Caml language with
I/O extensions and a custom runtime to emit binaries that execute as a
paravirtual operating system directly under Xen.
The custom run-time is significantly simpler than a general-purpose
operating system, with a static single-process 64-bit address space,
and "zero-copy" I/O that works directly with the OCaml garbage
collector. As a result, Mirage applications exhibit significant
performance speedups for I/O and memory handling versus the same code
running under Linux/Xen.
It's extremely interesting, IMHO. You write your code as a
more-or-less normal OCaml program, but you compile it with this
special tool-chain, and you boot it directly on ton of Xen, as a domU,
without the intermediate OS.
The project Git repo is at
http://github.com/avsm/mirage, and that
links a bit older presentation
(
http://anil.recoil.org/papers/2010-hotcloud-lamp.pdf) which IMHO is
a required read for anyone interested in cloud-related stuff.
ICFP proper
ICFP, the main conference, is as usual, full of math, designed to
scare people away. Err, I mean, designed to tell about new
developments in this field.
The papers are gathered in this reddit post:
http://www.reddit.com/r/haskell/comments/dji4v/papers_from_icfp_2010/.
XenServer
The paper
http://anil.recoil.org/papers/2010-icfp-xen.pdf describes
the XenServer project, and more specifically the management interface,
which is written entirely in OCaml, by a team of ~10 people.
They also have a bit of bin-packing code
(
http://xenbits.xen.org/xapi/xen-api.hg?file/95b9e4f1b9dd/ocaml/xapi/binpack.ml),
but much simpler than the
ganeti-htools
project on which I'm working
(no balancing or similar). On the other hand, it is interesting to see
that they committed themselves to write the entire management suite in
OCaml, as opposed to our Python plus just a few bits of Haskell. I
envy them, very much so
The fun tidbit was how stable they found the OCaml tool-chain. In
three years, they had just one bug related to this (the tool-chain), a
segfault due to wrong formatting code in the handling path of a
seldom-triggered Python exception, which led to a
%s
to be
translated as is to C code, which meant
printf
got the wrong
specifiers. So in an OCaml project, a serious bug was due to lack of
type checking in string formatting in a Python component
Overall I have found interesting parallels between their experience
and my own with Haskell; from is this language/tool-chain stable
enough for use to concerns about using FP for a significant project
(but of course, orders of magnitude difference between their case and
mine).
Super-compilation
There were two papers on this topic;
one at ICFP
proper and
one
at the Haskell symposium.
If I understood things right, super-compilation is the opposite of lazy
evaluation. But let's start a bit back.
First, programs written in a functional languages contain many pure
function. Pure functions, being side-effect-free, exhibit many
interesting properties, and one of them is that they permit decoupling
of the computation of a value from the declaration of such computation.
One technique derived from this is lazy computations. Lazy computation
means that even though you write
y = sin(x) + 1
, the actual call to
sin
and the increment is not actually done until you
need the
actual value of
y
(not just for storing it in
y
, that is).
Super-compilation is the opposite of this. While laziness postpones
doing a computation to as late as possible, in the hope it won't be
needed at all, super-compilation runs the actual computation way ahead
of time, even before the program starts, that is at compile
time. Since programs use input data at runtime, which is not known at
compile time, the trick is to pre-compute as much as possible at this
stage, given incomplete information; and the papers showed that it's
possible to pre-compute much more than you'd usually believe.
Optimising compilers already do some form of CAF (constant applicative
form) optimisation, but the techniques presented in both papers are
going beyond simple CAFs. The details are in the papers, and are
complicated, but one simple example is converting this:
ones = map inc [1..]
(note that is an infinite list, and you don't want to fully compute
that at any time, neither at compile nor at runtime), into:
ones = [2..]
The details on how this is done depend on the exact super-compilation
implementation, and not all implementations can do the same
optimisations, but in the presentations at this ICFP the idea was to
split things into static/non-static parts, and build on that, or to do
partial evaluation, etc.
The result is that super-compilation shows significant increases in
speed (in some synthetic benchmarks up to 80%), with a few benchmarks
not being sped up at all, but most are. The downside is an explosion
in code size, as many more things are pre-computed. Still, as a
research topic, this is very interesting.
My own paper
Surprisingly, it seems I survived giving my talk (just an experience
report, not actually fancy stuff like the rest). Either the people
actually found it interesting (less likely), or they were very polite
(more probably). Anyway, the paper is on the
papers page
(shameless self-promotion) (if you care).
Haskell symposium
This is co-located with ICFP, and is more practical than ICFP (which
is, as I said, way too much math). The papers are again available on
reddit:
http://www.reddit.com/r/haskell/comments/dkzn4/papers_from_the_haskell_symposium_2010/.
Orc: Concurrent Orchestration in Haskell
How to easily write code for synchronising concurrent work? As an
example, how complicated is to solve this problem:
[from the Orc literature] we might wish to contact two airlines
simultaneously seeking price quotes. If either quote comes back
below a threshold price, say $300, then let's buy a ticket
immediately. On the other hand if both quotes exceed the
threshold, then buy the cheapest ticket. Additionally, buy a
ticket if the other airline does not give a timely quote, or
notify the user if neither airline provides a timely quote.
Now, in my experience, this would require manual management and
synchronisation of threads, and for example in Python the early abort
of querying the price from airline B if airline A returned a good
price would be non-trivial; in addition, adding two extra conditions
would be again not a simple composition of rules. In Orc however, it
takes all of 10 lines of code to do this, safely. I won't copy the
code here, see the
paper for
details, or the
API docs.
And the best part is, this is just a
cabal install orc
away! I'll be
very interested to see how I can use this in practice, as it is seems
indeed a very powerful library.
Nikola: Embedding Compiled GPU Functions in Haskell
Interesting
project. Using
almost 100% Haskell code, and it marshalls the data to/from the GPU;
in the example given, the performance is as good as hand-written CUDA
code of course, this doesn't happen in all cases. It it's interesting
how they manage to compile the GPU-related code to actual object code
at Haskell compile time, which makes it transparent for the user.
Given the prevalence of high-speed GPUs, this could be a nice way to
utilise them without learning yet another language (or two, since
NVidia and ATI have different stacks).
They've also developed some interesting techniques (e.g. manual
control of sharing, to prevent inlining) as part of their project.
Scalable Event Handling for GHC
This is a
paper
co-authored by Johan Tibell of Google Switzerland and Bryan O'Sullivan
of Real World Haskell fame.
Very interesting serving 20K QPS from a Haskell process while having
16K+ idle clients connected:
[ ] the performance of the epoll back end does not begin to fall
off significantly until we have 50,000 idle connections open.
The talk started with the idea that we're on the path to serving 10
clients from a single machine, and how to improve Haskell's I/O
manager to achieve (almost) 10 concurrent clients. The slides (which
I don't have a link for) were a bit more cool on numbers than the
paper.
Oh, and this will come with GHC 7.0, for free
An LLVM Backend For GHC
The eternal fight to get better than the C backend in other words, to
not have to translate Haskell to C It is interesting to see how many
languages target LLVM nowadays, as opposed to plain old C via gcc. A
fun trivia item was that, according to the author, there are many
features in LLVM that are not actually used by 90% of the languages
targeting it (e.g. the garbage collector) (if I heard correctly in the
talk).
The results are interesting, although not too much overall speedup is
gained; this seems to be due to the hard-to-optimise Cmm intermediate
language inside GHC. In any case, a few benchmarks get more than 10%
speedup, even in the current form.
This will also come in GHC 7.0, although not enabled by default;
probably needs to mature some more
Conclusion
I've enjoyed this conference a lot, and I've learned a lot from
it. Given that many of the talks are about research topics, it's
harder to apply them in day-to-day work, but I think the pay-offs will
be long term rather than immediate.