Search Results: "nomeata"

28 July 2017

Joachim Breitner: How is coinduction the dual of induction?

Earlier today, I demonstrated how to work with coinduction in the theorem provers Isabelle, Coq and Agda, with a very simple example. This reminded me of a discussion I had in Karlsruhe with my then colleague Denis Lohner: If coinduction is the dual of induction, why do the induction principles look so different? I like what we observed there, so I d like to share this. The following is mostly based on my naive understanding of coinduction based on what I observe in the implementation in Isabelle. I am sure that a different, more categorial presentation of datatypes (as initial resp. terminal objects in some category of algebras) makes the duality more obvious, but that does not necessarily help the working Isabelle user who wants to make sense of coninduction.

Inductive lists I will use the usual polymorphic list data type as an example. So on the one hand, we have normal, finite inductive lists:
datatype 'a list = nil   cons (hd : 'a) (tl : "'a list")
with the well-known induction principle that many of my readers know by heart (syntax slightly un-isabellized):
P nil   ( x xs. P xs   P (cons x xs))     xs. P xs

Coinductive lists In contrast, if we define our lists coinductively to get possibly infinite, Haskell-style lists, by writing
codatatype 'a llist = lnil   lcons (hd : 'a)  (tl : "'a llist")
we get the following coinduction principle:
(  xs ys.
    R xs ys'   (xs = lnil) = (ys = lnil)  
               (xs   lnil   ys'   lnil  
	         hd xs = hd ys   R (tl xs) (tl ys)))  
  (  xs ys. R xs ys   xs = ys)
This is less scary that it looks at first. It tell you if you give me a relation R between lists which implies that either both lists are empty or both lists are nonempty, and furthermore if both are non-empty, that they have the same head and tails related by R, then any two lists related by R are actually equal. If you think of the infinte list as a series of states of a computer program, then this is nothing else than a bisimulation. So we have two proof principles, both of which make intuitive sense. But how are they related? They look very different! In one, we have a predicate P, in the other a relation R, to point out just one difference.

Relation induction To see how they are dual to each other, we have to recognize that both these theorems are actually specializations of a more general (co)induction principle. The datatype declaration automatically creates a relator:
rel_list :: ('a   'b   bool)   'a list   'b list   bool
The definition of rel_list R xs ys is that xs and ys have the same shape (i.e. length), and that the corresponding elements are pairwise related by R. You might have defined this relation yourself at some time, and if so, you probably introduced it as an inductive predicate. So it is not surprising that the following induction principle characterizes this relation:
Q nil nil  
( x xs y ys. R x y   Q xs ys   Q (cons x xs) (cons y ys))  
( xs ys   rel_list R xs ys   Q xs ys)
Note how how similar this lemma is in shape to the normal induction for lists above! And indeed, if we choose Q xs ys (P xs xs = ys) and R x y (x = y), then we obtain exactly that. In that sense, the relation induction is a generalization of the normal induction.

Relation coinduction The same observation can be made in the coinductive world. Here, as well, the codatatype declaration introduces a function
rel_llist :: ('a   'b   bool)   'a llist   'b llist   bool
which relates lists of the same shape with related elements only that this one also relates infinite lists, and therefore is a coinductive relation. The corresponding rule for proof by coinduction is not surprising and should remind you of bisimulation, too:
( xs ys.
    R xs ys   (xs = lnil) = (ys = lnil)  
              (xs   lnil   ys   lnil  
	        Q (hd xs) (hd ys)   R (tl xs) (tl ys)))  
(  xs ys   R xs ys   rel_llist Q xs ys)
It is even more obvious that this is a generalization of the standard coinduction principle shown above: Just instantiate Q with equality, which turns rel_llist Q into equality on the lists, and you have the theorem above.

The duality With our induction and coinduction principle generalized to relations, suddenly a duality emerges: If you turn around the implication in the conclusion of one you get the conclusion of the other one. This is an example of cosomething is something with arrows reversed . But what about the premise(s) of the rules? What happens if we turn around the arrow here? Although slighty less immediate, it turns out that they are the same as well. To see that, we start with the premise of the coinduction rule, reverse the implication and then show that to be equivalent to the two premises of the induction rule:
( xs ys.
    R xs ys   (xs = lnil) = (ys = lnil)  
              (xs   lnil   ys   lnil  
	        Q (hd xs) (hd ys)   R (tl xs) (tl ys)))
=   case analysis (the other two cases are vacuously true)  
  ( xs ys.
    xs = lnil   ys = lnil  
    R xs ys   (xs = lnil) = (ys = lnil)  
              (xs   lnil   ys   lnil  
	        Q (hd xs) (hd ys)   R (tl xs) (tl ys)))
  ( xs ys.
    xs   lnil   ys   lnil
    R xs ys   (xs = lnil) = (ys = lnil)  
              (xs   lnil   ys   lnil  
	        Q (hd xs) (hd ys)   R (tl xs) (tl ys)))
=   simplification  
  ( xs ys.  xs = lnil   ys = lnil   R xs ys
  ( x xs y ys.  R (cons x xs) (cons y ys)   (Q x y   R xs ys))
=   more rewriting  
  R nil nil
  ( x xs y ys. Q x y   R xs ys   R (cons x xs) (cons y ys))

Conclusion The coinduction rule is not the direct dual of the induction rule, but both are specializations of more general, relational proof methods, where the duality is clearly present. More generally, this little excursion shows that it is often beneficial to think of types less as sets, and more as relations this way of thinking is surprisingly fruitful, and led to proofs of parametricity and free theorems and other nice things.

27 July 2017

Joachim Breitner: Coinduction in Coq and Isabelle

The DeepSpec Summer School is almost over, and I have had a few good discussions. One revolved around coinduction: What is it, how does it differ from induction, and how do you actually prove something. In the course of the discussion, I came up with a very simple coinductive exercise, and solved it both in Coq and Isabelle

The task Define the extended natural numbers coinductively. Define the min function and the relation. Show that min(n, m) n holds.

Coq The definitions are straight forward. Note that in Coq, we use the same command to define a coinductive data type and a coinductively defined relation:
CoInductive ENat :=
    N : ENat
    S : ENat -> ENat.
CoFixpoint min (n : ENat) (m : ENat)
  :=match n, m with   S n', S m' => S (min n' m')
                      _, _       => N end.
CoInductive le : ENat -> ENat -> Prop :=
    leN : forall m, le N m
    leS : forall n m, le n m -> le (S n) (S m).
The lemma is specified as
Lemma min_le: forall n m, le (min n m) n.
and the proof method of choice to show that some coinductive relation holds, is cofix. One would wish that the following proof would work:
Lemma min_le: forall n m, le (min n m) n.
Proof.
  cofix.
  destruct n, m.
  * apply leN.
  * apply leN.
  * apply leN.
  * apply leS.
    apply min_le.
Qed.
but we get the error message
Error:
In environment
min_le : forall n m : ENat, le (min n m) n
Unable to unify "le N ?M170" with "le (min N N) N
Effectively, as Coq is trying to figure out whether our proof is correct, i.e. type-checks, it stumbled on the equation min N N = N, and like a kid scared of coinduction, it did not dare to run the min function. The reason it does not just run a CoFixpoint is that doing so too daringly might simply not terminate. So, as Adam explains in a chapter of his book, Coq reduces a cofixpoint only when it is the scrutinee of a match statement. So we need to get a match statement in place. We can do so with a helper function:
Definition evalN (n : ENat) :=
  match n with   N => N
                 S n => S n end.
Lemma evalN_eq : forall n, evalN n = n.
Proof. intros. destruct n; reflexivity. Qed.
This function does not really do anything besides nudging Coq to actually evaluate its argument to a constructor (N or S _). We can use it in the proof to guide Coq, and the following goes through:
Lemma min_le: forall n m, le (min n m) n.
Proof.
  cofix.
  destruct n, m; rewrite <- evalN_eq with (n := min _ _).
  * apply leN.
  * apply leN.
  * apply leN.
  * apply leS.
    apply min_le.
Qed.

Isabelle In Isabelle, definitions and types are very different things, so we use different commands to define ENat and le:
theory ENat imports  Main begin
codatatype ENat =  N   S  ENat
primcorec min where
   "min n m = (case n of
       N   N
       S n'   (case m of
        N   N
        S m'   S (min n' m')))"
coinductive le where
  leN: "le N m"
  leS: "le n m   le (S n) (S m)"
There are actually many ways of defining min; I chose the one most similar to the one above. For more details, see the corec tutorial. Now to the proof:
lemma min_le: "le (min n m) n"
proof (coinduction arbitrary: n m)
  case le
  show ?case
  proof(cases n)
    case N then show ?thesis by simp
  next
    case (S n') then show ?thesis
    proof(cases m)
      case N then show ?thesis by simp
    next
      case (S m')  with  n = _  show ?thesis
        unfolding min.code[where n = n and m = m]
        by auto
    qed
  qed
qed
The coinduction proof methods produces this goal:
proof (state)
goal (1 subgoal):
 1.  n m. ( m'. min n m = N   n = m')  
          ( n' m'.
               min n m = S n'  
               n = S m'  
	       (( n m. n' = min n m   m' = n)   le n' m'))
I chose to spell the proof out in the Isar proof language, where the outermost proof structure is done relatively explicity, and I proceed by case analysis mimiking the min function definition. In the cases where one argument of min is N, Isabelle s simplifier (a term rewriting tactic, so to say), can solve the goal automatically. This is because the primcorec command produces a bunch of lemmas, one of which states n = N m = N min n m = N. In the other case, we need to help Isabelle a bit to reduce the call to min (S n) (S m) using the unfolding methods, where min.code contains exactly the equation that we used to specify min. Using just unfolding min.code would send this method into a loop, so we restrict it to the concrete arguments n and m. Then auto can solve the remaining goal (despite all the existential quantifiers).

Summary Both theorem provers are able to prove the desired result. To me it seems that it is slightly more convenient in Isabelle because a lot of Coq infrastructure relies on the type checker being able to effectively evaluate expressions, which is tricky with cofixpoints, wheras evaluation plays a much less central role in Isabelle, where rewriting is the crucial technique, and while one still cannot simply throw min.code into the simpset, so working with objects that do not evaluate easily or completely is less strange.

Agda I was challenged to do it in Agda. Here it is:
module ENat where
open import Coinduction
data ENat : Set where
  N : ENat
  S :   ENat   ENat
min : ENat   ENat   ENat
min (S n') (S m') = S (  (min (  n') (  m')))
min _ _ = N
data le : ENat   ENat   Set where
  leN :    m    le N m
  leS :    n m      (le (  n) (  m))   le (S n) (S m)
min_le :    n m    le (min n m) n
min_le  S n'   S m'  = leS (  min_le)
min_le  N      S m'  = leN
min_le  S n'   N  = leN
min_le  N      N  = leN
I will refrain from commenting it, because I do not really know what I have been doing here, but it typechecks, and refer you to the official documentation on coinduction in Agda. But let me note that I wrote this using plain inductive types and recursion, and added , and until it worked.

6 July 2017

Joachim Breitner: The Micro Two Body Problem

Inspired by recent PhD comic Academic Travel and not-so-recent xkcd comic Movie Narrative Charts , I created the following graphics, which visualizes the travels of an academic couple over the course of 10 months (place names anonymized).
Two bodies traveling the world

Two bodies traveling the world

23 June 2017

Joachim Breitner: The perils of live demonstrations

Yesterday, I was giving a talk at the The South SF Bay Haskell User Group about how implementing lock-step simulation is trivial in Haskell and how Chris Smith and me are using this to make CodeWorld even more attractive to students. I gave the talk before, at Compose::Conference in New York City earlier this year, so I felt well prepared. On the flight to the West Coast I slightly extended the slides, and as I was too cheap to buy in-flight WiFi, I tested them only locally. So I arrived at the offices of Target1 in Sunnyvale, got on the WiFi, uploaded my slides, which are in fact one large interactive CodeWorld program, and tried to run it. But I got a type error Turns out that the API of CodeWorld was changed just the day before:
commit 054c811b494746ec7304c3d495675046727ab114
Author: Chris Smith <cdsmith@gmail.com>
Date:   Wed Jun 21 23:53:53 2017 +0000
    Change dilated to take one parameter.
    
    Function is nearly unused, so I'm not concerned about breakage.
    This new version better aligns with standard educational usage,
    in which "dilation" means uniform scaling.  Taken as a separate
    operation, it commutes with rotation, and preserves similarity
    of shapes, neither of which is true of scaling in general.
Ok, that was quick to fix, and the CodeWorld server started to compile my code, and compiled, and aborted. It turned out that my program, presumably the larges CodeWorld interaction out there, hit the time limit of the compiler. Luckily, Chris Smith just arrived at the venue, and he emergency-bumped the compiler time limit. The program compiled and I could start my presentation. Unfortunately, the biggest blunder was still awaiting for me. I came to the slide where two instances of pong are played over a simulated network, and my point was that the two instances are perfectly in sync. Unfortunately, they were not. I guess it did support my point that lock-step simulation can easily go wrong, but it really left me out in the rain there, and I could not explain it I did not modify this code since New York, and there it worked flawless2. In the end, I could save my face a bit by running the real pong game against an attendee over the network, and no desynchronisation could be observed there. Today I dug into it and it took me a while, and it turned out that the problem was not in CodeWorld, or the lock-step simulation code discussed in our paper about it, but in the code in my presentation that simulated the delayed network messages; in some instances it would deliver the UI events in different order to the two simulated players, and hence cause them do something different. Phew.

  1. Yes, the retail giant. Turns out that they have a small but enthusiastic Haskell-using group in their IT department.
  2. I hope the video is going to be online soon, then you can check for yourself.

5 May 2017

Joachim Breitner: Farewall green cap

For the last two years, I was known among swing dancers for my green flat cap:
Monti, a better model than me

Monti, a better model than me

This cap was very special: It was a gift from a good friend who sewed it by hand from what used to be a table cloth of my deceased granny, and it has traveled with me to many corners of the world. Just like last week, when I was in Paris where I attended the Charleston class of Romuald and Laura on Saturday (April 29). The following Tuesday I went to a Swing Social and wanted to put on the hat, and noticed that it was gone. The next day I bugged the manager and the caretaker of the venue of the class (Salles Sainte-Roche), and it seems that the hat was still there, that morning, im Salle Kurtz1, but when I went there it was gone. And that is sad.
The last picture with the hat

The last picture with the hat


  1. How fitting, given that my granny s maiden name is Kurz.

28 April 2017

Joachim Breitner: ghc-proofs rules more now

A few weeks ago I blogged about an experiment of mine, where I proved equalities of Haskell programs by (ab)using the GHC simplifier. For more details, please see that post, or the video of my talk at the Z rich Haskell User Group, but one reason why this approach has any chance of being useful is the compiler s support for rewrite rules. Rewrite rules are program equations that the programmer specifies in the source file, and which the compiler then applies, from left to right, whenever some intermediate code matches the left-hand side of the equation. One such rule, for example, is
 -# RULES "foldr/nil" forall k z. foldr k z [] = z #- 
taken right out of the standard library. In my blog post I went through the algebraic laws that a small library of mine, successors, should fulfill, and sure enough, once I got to more interesting proofs, they would not go through just like that. At that point I had to add additional rules to the file I was editing, which helped the compiler to finish the proofs. Some of these rules were simple like
 -# RULES "mapFB/id" forall c. mapFB c (\x -> x) = c #- 
 -# RULES "foldr/nil" forall k n. GHC.Base.foldr k n [] = n #- 
 -# RULES "foldr/undo" forall xs. GHC.Base.foldr (:) [] xs = xs #- 
and some are more intricate like
 -# RULES "foldr/mapFB" forall c f g n1 n2 xs.
    GHC.Base.foldr (mapFB c f) n1 (GHC.Base.foldr (mapFB (:) g) n2 xs)
    = GHC.Base.foldr (mapFB c (f.g)) (GHC.Base.foldr (mapFB c f) n1 n2) xs
    #- 
But there is something fishy going on here: The foldr/nil rule is identical to a rule in the standard library! I should not have to add to my file that as I am proving things. So I knew that the GHC plugin, which I wrote to do these proofs, was doing something wrong, but I did not investigate for a while. I returned to this problem recetly, and with the efficient and quick help of Simon Peyton Jones, I learned what I was doing wrong.1 After fixing it, I could remove all the simple rules from the files with my proofs. And to my surprise, I could remove the intricate rule as well! So with this bug fixed, ghc-proofs is able to prove all the Functor, Applicative and Monad rules of the Succs functor without any additional rewrite rules, as you can see in the example file! (I still have to strategically place seqs in a few places.) That s great, isn t it! Yeah, sure. But having to introduce the rules at that point provided a very good narrative in my talk, so when I will give a similar talk next week in Pairs (actually, twice, first at the university and then at the Paris Haskell meetup, I will have to come up with a different example that calls for additional rewrite rules. In related news: Since the last blog post, ghc-proofs learned to interpret proof specifications like
applicative_law4 :: Succs (a -> b) -> a -> Proof
applicative_law4 u y = u <*> pure y
                   === pure ($ y) <*> u
where it previously only understood
applicative_law4 = (\ u (y::a) -> u <*> (pure y :: Succs a))
               === (\ u (y::a) -> pure ($ y) <*> u)
I am not sur if this should be uploaded to Hackage, but I invite you to play around with the GitHub version of ghc-proofs.

  1. In short: I did not initialize the simplifier with the right InScopeSet, so RULES about functions defined in the current module were not always applied, and I did not feed the eps_rules to the simplifier, which contains all the rules found in imported packages, such as base.

21 April 2017

Joachim Breitner: veggies: Haskell code generation from scratch

How hard it is to write a compiler for Haskell Core? Not too hard, actually! I wish we had a formally verified compiler for Haskell, or at least for GHC s intermediate language Core. Now formalizing that part of GHC itself seems to be far out of reach, with the many phases the code goes through (Core to STG to CMM to Assembly or LLVM) and optimizations happening at all of these phases and the many complicated details to the highly tuned GHC runtime (pointer tagging, support for concurrency and garbage collection).

Introducing Veggies So to make that goal of a formally verified compiler more feasible, I set out and implemented code generation from GHC s intermediate language Core to LLVM IR, with simplicity as the main design driving factor. You can find the result in the GitHub repository of veggies (the name derives from verifiable GHC ). If you clone that and run ./boot.sh some-directory, you will find that you can use the program some-directory/bin/veggies just like like you would use ghc. It comes with the full base library, so your favorite variant of HelloWorld might just compile and run. As of now, the code generation handles all the Core constructs (which is easy when you simply ignore all the types). It supports a good number of primitive operations, including pointers and arrays I implement these as need and has support for FFI calls into C.

Why you don't want to use Veggies Since the code generator was written with simplicity in mind, performance of the resulting code is abysmal: Everything is boxed, i.e. represented as pointer to some heap-allocated data, including unboxed integer values and unboxed tuples. This is very uniform and simplifies the code, but it is also slow, and because there is no garbage collection (and probably never will be for this project), will fill up your memory quickly. Also, the code is currently only supports 64bit architectures, and this is hard-coded in many places. There is no support for concurrency.

Why it might be interesting to you nevertheless So if it is not really usable to run programs with, should you care about it? Probably not, but maybe you do for one of these reasons:
  • You always wondered how a compiler for Haskell actually works, and reading through a little over a thousands lines of code is less daunting than reading through the 34k lines of code that is GHC s backend.
  • You have wacky ideas about Code generation for Haskell that you want to experiment with.
  • You have wacky ideas about Haskell that require special support in the backend, and want to prototype that.
  • You want to see how I use the GHC API to provide a ghc-like experience. (I copied GHC s Main.hs and inserted a few hooks, an approach I copied from GHCJS).
  • You want to learn about running Haskell programs efficiently, and starting from veggies, you can implement all the trick of the trade yourself and enjoy observing the speed-ups you get.
  • You want to compile Haskell code to some weird platform that is supported by LLVM, but where you for some reason cannot run GHC s runtime. (Because there are no threads and no garbage collection, the code generated by veggies does not require a runtime system.)
  • You want to formally verify Haskell code generation. Note that the code generator targets the same AST for LLVM IR that the vellvm2 project uses, so eventually, veggies can become a verified arrow in the top right corner map of the DeepSpec project.
So feel free to play around with veggies, and report any issues you have on the GitHub repository.

28 March 2017

Joachim Breitner: Birthday greetings communication behaviour

Randall Munroe recently mapped how he communicated with his social circle. As I got older recently, I had an opportunity to create a similar statistics that shows how people close to me chose to fulfil their social obligations (given the current geographic circumstances):
Communication variants

Communication variants

(Diagram created with the xkcd-font and using these two stackoverflow answers.) In related news: Heating 3 US cups of water to a boil takes 7 minutes and 40 seconds on one particular gas stove, but only 3 minutes and 50 seconds with an electric kettle, despite the 110V-induced limitation to 1.5kW. (Diagram updated on March 30, as the actual mail is slower than the other channels.)

7 February 2017

Joachim Breitner: Why prove programs equivalent when your compiler can do that for you?

Last week, while working on CodeWorld, via a sequence of yak shavings, I ended up creating a nicely small library that provides Control.Applicative.Succs, a new applicative functor. And because I am trying to keep my Haskell karma good1, I wanted to actually prove that my code fulfills the Applicative and Monad laws. This led me to inserted writing long comments into my code, filled with lines like this:
The second Applicative law:
  pure (.) <*> Succs u us <*> Succs v vs <*> Succs w ws
= Succs (.) [] <*> Succs u us <*> Succs v vs <*> Succs w ws
= Succs (u .) (map (.) us) <*> Succs v vs <*> Succs w ws
= Succs (u . v) (map ($v) (map (.) us) ++ map (u .) vs) <*> Succs w ws
= Succs (u . v) (map (($v).(.)) us ++ map (u .) vs) <*> Succs w ws
= Succs ((u . v) w) (map ($w) (map (($v).(.)) us ++ map (u .) vs) ++ map (u.v) ws)
= Succs ((u . v) w) (map (($w).($v).(.)) us ++ map (($w).(u.)) vs ++ map (u.v) ws)
= Succs (u (v w)) (map (\u -> u (v w)) us ++ map (\v -> u (v w)) vs ++ map (\w -> u (v w)) ws)
= Succs (u (v w)) (map ($(v w)) us ++ map u (map ($w) vs ++ map v ws))
= Succs u us <*> Succs (v w) (map ($w) vs ++ map v ws)
= Succs u us <*> (Succs v vs <*> Succs w ws)
Honk if you have done something like this before! I proved all the laws, but I was very unhappy. I have a PhD on something about Haskell and theorem proving. I have worked with Isabelle, Agda and Coq. Both Haskell and theorem proving is decades old. And yet, I sit here, and tediously write manual proofs by hand. Is this really the best we can do? Of course I could have taken my code, rewritten it in, say, Agda, and proved it correct there. But (right now) I don t care about Agda code. I care about my Haskell code! I don t want to write it twice, worry about copying mistakes and mismatchs in semantics, and have external proofs to maintain. Instead, I want to prove where I code, and have the proofs checked together with my code! Then it dawned to me that this is, to some extent, possible. The Haskell compiler comes with a sophisticated program transformation machinery, which is meant to simplify and optimize code. But it can also be used to prove Haskell expressions to be equivalent! The idea is simple: Take two expressions, run both through the compiler s simplifier, and check if the results are the same. If they are, then the expressions are, as far as the compiler is concerned, equivalent. A handful of hours later, I was able to write proof tasks like
app_law_2 = (\ a b (c::Succs a) -> pure (.) <*> a <*> b <*> c)
        === (\ a b c -> a <*> (b <*> c))
and others into my source file, and the compiler would tell me happily:
[1 of 1] Compiling Successors       ( Successors.hs, Successors.o )
GHC.Proof: Proving getCurrent_proof1  
GHC.Proof: Proving getCurrent_proof2  
GHC.Proof: Proving getCurrent_proof3  
GHC.Proof: Proving ap_star  
GHC.Proof: Proving getSuccs_proof1  
GHC.Proof: Proving getSuccs_proof2  
GHC.Proof: Proving getSuccs_proof3  
GHC.Proof: Proving app_law_1  
GHC.Proof: Proving app_law_2  
GHC.Proof: Proving app_law_3  
GHC.Proof: Proving app_law_4  
GHC.Proof: Proving monad_law_1  
GHC.Proof: Proving monad_law_2  
GHC.Proof: Proving monad_law_3  
GHC.Proof: Proving return_pure  
GHC.Proof proved 15 equalities
This is how I want to prove stuff about my code! Do you also want to prove stuff about your code? I packaged this up as a GHC plugin in the Haskell library ghc-proofs (not yet on Hackage). The README of the repository has a bit more detail on how to use this plugin, how it works, what its limitations are and where this is heading. This is still only a small step, but finally there is a step towards low threshold program equivalence proofs in Haskell.

  1. Or rather recover my karma after such abominations such as ghc-dup, seal-module or ghc-heap-view.

20 January 2017

Joachim Breitner: Global almost-constants for Haskell

More than five years ago I blogged about the configuration problem and a proposed solution for Haskell, which turned into some Template Haskell hacks in the seal-module package. With the new GHC proposal process in plase, I am suddenly much more inclined to write up my weird wishes for the Haskell language in proposal form, to make them more coherent, get feedback, and maybe (maybe) actually get them implemented. But even if the proposal is rejected it is still a nice forum to discuss these ideas. So I turned my Template Haskell hack into a proposed new syntactic feature. The idea is shamelessly stolen from Isabelle, including some of the keywords, and would allow you to write
context fixes progName in
  foo :: Maybe Int -> Either String Int
  foo Nothing  = Left $ progName ++ ": no number given"
  foo (Just i) = bar i
  bar :: Int -> Either String Int
  bar 0 = Left $ progName ++ ": zero no good"
  bar n = Right $ n + 1
instead of
foo :: String -> Maybe Int -> Either String Int
foo progName Nothing  = Left $ progName ++ ": no number given"
foo progName (Just i) = bar progName  i
bar :: String -> Int -> Either String Int
bar progName 0 = Left $ progName ++ ": zero no good"
bar progName n = Right $ n + 1
when you want to have an almost constant parameter. I am happy to get feedback at the GitHub pull request.

6 January 2017

Joachim Breitner: TikZ aesthetics

Every year since 2012, I typeset the problems and solutions for the German math event Tag der Mathematik, which is organized by the Zentrum f r Mathematik and reaches 1600 students from various parts of Germany. For that, I often reach to the LaTeX drawing package TikZ, and I really like the sober aesthetics of a nicely done TikZ drawing. So mostly for my own enjoyment, I collect the prettiest here. On a global scale they are still rather mundane, and for really impressive and educating examples, I recommend the TikZ Gallery.

23 November 2016

Joachim Breitner: microG on Jolla

I am a incorrigibly in picking non-mainstream, open smartphones, and then struggling hard. Back then in 2008, I tried to use the OpenMoko FreeRunner, but eventually gave up because of hardware glitches and reverted to my good old Siemens S35. It was not that I would not be willing to put up with inconveniences, but as soon as it makes live more difficult for the people I communicate with, it becomes hard to sustain. Two years ago I tried again, and got myself a Jolla phone, running Sailfish OS. Things are much nicer now: The hardware is mature, battery live is good, and the Android compatibility layer enables me to run many important apps that are hard to replace, especially the Deutsche Bahn Navigator and various messengers, namely Telegram, Facebook Messenger, Threema and GroupMe. Some apps that require Google Play Services, which provides a bunch of common tasks and usually comes with the Google Play store would not run on my phone, as Google Play is not supported on Sailfish OS. So far, the most annoying ones of that sort were Uber and Lyft, making me pay for expensive taxis when others would ride cheaper, but I can live with that. I tried to install Google Play Services from shady sources, but it would regularly crash.

Signal on Jolla Now in Philadelphia, people urged me to use the Signal messenger, and I was convinced by its support for good end-to-end crypto, while still supporting offline messages and allowing me to switch from my phone to my desktop and back during a conversation. The official Signal app uses Google Cloud Messaging (GCM, part of Google Play Services) to get push updates about new posts, and while I do not oppose this use of Google services (it really is just a ping without any metadata), this is a problem on Sailfish OS. Luckily, the Signal client is open source, and someone created a LibreSignal edition that replaced the use of GCM with websockets, and indeed, this worked on my phone, and I could communicate. Things were not ideal, though: I would often have to restart the app to get newly received messages; messages that I send via Signal Desktop would often not show up on the phone and, most severe, basically after every three messages, sending more messages from Desktop would stop working for my correspondents, which freaked them out. (Strangely it continued working from their phone app, so we coped for a while.) So again, my choice of non-standard devices causes inconveniences to others. This, and the fact that the original authors of Signal and the maintainers of LibreSignal got into a fight that ended LibreSignal discontinued, meant that I have to change something about this situation. I was almost ready to give in and get myself a Samsung S7 or something boring of the sort, but then I decided to tackle this issue once more, following some of the more obscure instructions out there, trying to get vanilla Signal working on my phone. About a day later, I got it, and this is how I did it.

microG So I need Google Play Services somehow, but installing the real thing did not seem to be very promising (I tried, and regularly got pop-ups telling me that Play Services has crashed.) But I found some references to a project called microG , which is an independent re-implementation of (some of) of the play services, in particular including GCM. Installing microG itself was easy, as you can add their repository to F-Droid. I installed the core services, the services framework and the fake store apps. If this had been all that was to do, things would be easy!

Play Store detection work arounds But Signal would still complain about the lack of Google Play Services. It asks Android if an app with a certain name is installed, and would refuse to work if this app does not exist. For some reason, the microG apps cannot just have the names of the real Google apps. There seem to be two ways of working around this: Patching Signal, or enabling Signature Spoofing. The initially most promising instructions (which are in a README in a tarball on a fishy file hoster linked from an answer on the Jolla support forum ) suggested patching Signal, and actually came both with a version of an app called Lucky Patcher as well as a patched Android package, but both about two years old. I tried a recent version of the Lucky Patcher, but it failed to patch the current version of Signal.

Signature Spoofing So on to Signature Spoofing. This is a feature of some non-standard Android builds that allow apps (such as microG) to fake the existence of other apps (the Play Store), and is recommended by the microG project. Sailfish OS s Android compatibility layer Alien Dalvik does not support it out of the box, but there is a tool tingle that adds this feature to existing Android systems. One just has to get the /system/framework/framework.jar file, put it into the input folder of this project, run python main.py, select 2, and copy the framework.jar from output/ back. Great.

Deodexing Alien Dalvik Only that it only works on deodexed files. I did not know anything about odexed Android Java classes (and did not really want to know), but there was not way around. Following this explanation I gathered that one finds files foo.odex in the Android system folder, runs some tool on them to create a classes.dex file, and adds that to the corresponding foo.jar or foo.apk file, copies this back to the phone and deletes the foo.odex file. The annoying this is that one does not only have to do it for framework.jar in order to please tingle, because if one does it to one odex file, one has to do to all! It seems that for people using Windows, the Universal Deodexer V5 seems to be a convenient tool, but I had to go more manually. So I first fetched smali , compiled it using ./gradlew build. Then I fetched the folders /opt/alien/system/framework and /opt/alien/system/app from the phone (e.g. using scp). Keep a backup of these in case something breaks. Then I ran these commands (disclaimer: I fetched these from my bash history and slightly cleaned them up. This is not a fire-and-forget script! Use it when you know what it and you are doing):
cd framework
for file in *.odex
do
  java -jar ~/build/smali/baksmali/build/libs/baksmali.jar deodex $file -o out
  java -jar ~/build/smali/smali/build/libs/smali.jar a out -o classes.dex
  zip -u $(basename $file .odex).jar classes.dex
  rm -rf out classes.dex $file
done
cd ..
cd app
for file in *.odex
do
  java -jar ~/build/smali/baksmali/build/libs/baksmali.jar deodex -d ../framework $file -o out
  java -jar ~/build/smali/smali/build/libs/smali.jar a out -o classes.dex
  zip -u $(basename $file .odex).apk classes.dex
  rm -rf out classes.dex $file
done
cd ..
The resulting framework.jar can now be patched with tingle:
mv framework/framework.jar ~/build/tingle/input
cd ~/build/tingle
./main.py
# select 2
cd -
mv ~/build/tingle/output/framework.jar framework/framework.jar
Now I copy these framework and app folders back on my phone, and restart Dalvik:
devel-su systemctl restart aliendalvik.service
It might start a bit slower than usually, but eventually, all the Android apps should work as before. The final bit that was missing in my case was that I had to reinstall Signal: If it is installed before microG is installed, it does not get permission to use GCM, and when it tries (while registering: After generating the keys) it just crashes. I copied /data/data/org.thoughtcrime.secretsms/ before removing Signal and moved it back after (with cp -a to preserve permissions) so that I could keep my history. And now, it seems, vanilla Signal is working just fine on my Jolla phone!

What s missing Am I completely happy with Signal? No! An important feature that it is lacking is a way to get out all data (message history including media files) in a file format that can be read without Signal; e.g. YAML files or clean HTML code. I do want to be able to re-read some of the more interesting conversations when I am 74 or 75, and I doubt that there will be a Signal App, or even Android, then. I hope that this becomes available in time, maybe in the Desktop version. I would also hope that pidgin gets support to the Signal protocol, so that I conveniently use one program for all my messaging needs on the desktop. Finally it would be nice if my Signal identity was less tied to one phone number. I have a German and a US phone number, and would want to be reachable under both on all my clients. (If you want to contact me on Signal, use my US phone number.)

Alternatives Could I have avoided this hassle by simply convincing people to use something other than Signal? Tricky, at the moment. Telegram (which works super reliable for me, and has a pidgin plugin) has dubious crypto and does not support crypto while using multiple clients. Threema has no desktop client that I know of. OTR on top of Jabber does not support offline messages. So nothing great seems to exist right now. In the long run, the best bet seems to be OMEMO (which is, in essence, the Signal protocol) on top of Jabber. It is currently supported by one Android Jabber client (Conversations) and one Desktop application (gajim, via a plugin). I should keep an eye on pidgin support for OMEMO and other development around this.

26 October 2016

Joachim Breitner: Showcasing Applicative

My plan for this week s lecture of the CIS 194 Haskell course at the University of Pennsylvania is to dwell a bit on the concept of Functor, Applicative and Monad, and to highlight the value of the Applicative abstraction. I quite like the example that I came up with, so I want to share it here. In the interest of long-term archival and stand-alone presentation, I include all the material in this post.1

Imports In case you want to follow along, start with these imports:
import Data.Char
import Data.Maybe
import Data.List
import System.Environment
import System.IO
import System.Exit

The parser The starting point for this exercise is a fairly standard parser-combinator monad, which happens to be the result of the student s homework from last week:
newtype Parser a = P (String -> Maybe (a, String))
runParser :: Parser t -> String -> Maybe (t, String)
runParser (P p) = p
parse :: Parser a -> String -> Maybe a
parse p input = case runParser p input of
    Just (result, "") -> Just result
    _ -> Nothing -- handles both no result and leftover input
noParserP :: Parser a
noParserP = P (\_ -> Nothing)
pureParserP :: a -> Parser a
pureParserP x = P (\input -> Just (x,input))
instance Functor Parser where
    fmap f p = P $ \input -> do
	(x, rest) <- runParser p input
	return (f x, rest)
instance Applicative Parser where
    pure = pureParserP
    p1 <*> p2 = P $ \input -> do
        (f, rest1) <- runParser p1 input
        (x, rest2) <- runParser p2 rest1
        return (f x, rest2)
instance Monad Parser where
    return = pure
    p1 >>= k = P $ \input -> do
        (x, rest1) <- runParser p1 input
        runParser (k x) rest1
anyCharP :: Parser Char
anyCharP = P $ \input -> case input of
    (c:rest) -> Just (c, rest)
    []       -> Nothing
charP :: Char -> Parser ()
charP c = do
    c' <- anyCharP
    if c == c' then return ()
               else noParserP
anyCharButP :: Char -> Parser Char
anyCharButP c = do
    c' <- anyCharP
    if c /= c' then return c'
               else noParserP
letterOrDigitP :: Parser Char
letterOrDigitP = do
    c <- anyCharP
    if isAlphaNum c then return c else noParserP
orElseP :: Parser a -> Parser a -> Parser a
orElseP p1 p2 = P $ \input -> case runParser p1 input of
    Just r -> Just r
    Nothing -> runParser p2 input
manyP :: Parser a -> Parser [a]
manyP p = (pure (:) <*> p <*> manyP p)  orElseP  pure []
many1P :: Parser a -> Parser [a]
many1P p = pure (:) <*> p <*> manyP p
sepByP :: Parser a -> Parser () -> Parser [a]
sepByP p1 p2 = (pure (:) <*> p1 <*> (manyP (p2 *> p1)))  orElseP  pure []
A parser using this library for, for example, CSV files could take this form:
parseCSVP :: Parser [[String]]
parseCSVP = manyP parseLine
  where
    parseLine = parseCell  sepByP  charP ',' <* charP '\n'
    parseCell = do
        charP '"'
        content <- manyP (anyCharButP '"')
        charP '"'
        return content

We want EBNF Often when we write a parser for a file format, we might also want to have a formal specification of the format. A common form for such a specification is EBNF. This might look as follows, for a CSV file:
cell = '"',  not-quote , '"';
line = (cell,  ',', cell    ''), newline;
csv  =  line ;
It is straightforward to create a Haskell data type to represent an EBNF syntax description. Here is a simple EBNF library (data type and pretty-printer) for your convenience:
data RHS
  = Terminal String
    NonTerminal String
    Choice RHS RHS
    Sequence RHS RHS
    Optional RHS
    Repetition RHS
  deriving (Show, Eq)
ppRHS :: RHS -> String
ppRHS = go 0
  where
    go _ (Terminal s)     = surround "'" "'" $ concatMap quote s
    go _ (NonTerminal s)  = s
    go a (Choice x1 x2)   = p a 1 $ go 1 x1 ++ "   " ++ go 1 x2
    go a (Sequence x1 x2) = p a 2 $ go 2 x1 ++ ", "  ++ go 2 x2
    go _ (Optional x)     = surround "[" "]" $ go 0 x
    go _ (Repetition x)   = surround " " " " $ go 0 x
    surround c1 c2 x = c1 ++ x ++ c2
    p a n   a > n     = surround "(" ")"
            otherwise = id
    quote '\'' = "\\'"
    quote '\\' = "\\\\"
    quote c    = [c]
type Production = (String, RHS)
type BNF = [Production]
ppBNF :: BNF -> String
ppBNF = unlines . map (\(i,rhs) -> i ++ " = " ++ ppRHS rhs ++ ";")

Code to produce EBNF We had a good time writing combinators that create complex parsers from primitive pieces. Let us do the same for EBNF grammars. We could simply work on the RHS type directly, but we can do something more nifty: We create a data type that keeps track, via a phantom type parameter, of what Haskell type the given EBNF syntax is the specification:
newtype Grammar a = G RHS
ppGrammar :: Grammar a -> String
ppGrammar (G rhs) = ppRHS rhs
So a value of type Grammar t is a description of the textual representation of the Haskell type t. Here is one simple example:
anyCharG :: Grammar Char
anyCharG = G (NonTerminal "char")
Here is another one. This one does not describe any interesting Haskell type, but is useful when spelling out the special characters in the syntax described by the grammar:
charG :: Char -> Grammar ()
charG c = G (Terminal [c])
A combinator that creates new grammar from two existing grammars:
orElseG :: Grammar a -> Grammar a -> Grammar a
orElseG (G rhs1) (G rhs2) = G (Choice rhs1 rhs2)
We want the convenience of our well-known type classes in order to combine these values some more:
instance Functor Grammar where
    fmap _ (G rhs) = G rhs
instance Applicative Grammar where
    pure x = G (Terminal "")
    (G rhs1) <*> (G rhs2) = G (Sequence rhs1 rhs2)
Note how the Functor instance does not actually use the function. How should it? There are no values inside a Grammar! We cannot define a Monad instance for Grammar: We would start with (G rhs1) >>= k = , but there is simply no way of getting a value of type a that we can feed to k. So we will do without a Monad instance. This is interesting, and we will come back to that later. Like with the parser, we can now begin to build on the primitive example to build more complicated combinators:
manyG :: Grammar a -> Grammar [a]
manyG p = (pure (:) <*> p <*> manyG p)  orElseG  pure []
many1G :: Grammar a -> Grammar [a]
many1G p = pure (:) <*> p <*> manyG p
sepByG :: Grammar a -> Grammar () -> Grammar [a]
sepByG p1 p2 = ((:) <$> p1 <*> (manyG (p2 *> p1)))  orElseG  pure []
Let us run a small example:
dottedWordsG :: Grammar [String]
dottedWordsG = many1G (manyG anyCharG <* charG '.')
*Main> putStrLn $ ppGrammar dottedWordsG
'', ('', char, ('', char, ('', char, ('', char, ('', char, ('',  
Oh my, that is not good. Looks like the recursion in manyG does not work well, so we need to avoid that. But anyways we want to be explicit in the EBNF grammars about where something can be repeated, so let us just make many a primitive:
manyG :: Grammar a -> Grammar [a]
manyG (G rhs) = G (Repetition rhs)
With this definition, we already get a simple grammar for dottedWordsG:
*Main> putStrLn $ ppGrammar dottedWordsG
'',  char , '.',  char , '.' 
This already looks like a proper EBNF grammar. One thing that is not nice about it is that there is an empty string ('') in a sequence ( , ). We do not want that. Why is it there in the first place? Because our Applicative instance is not lawful! Remember that pure id <*> g == g should hold. One way to achieve that is to improve the Applicative instance to optimize this case away:
instance Applicative Grammar where
    pure x = G (Terminal "")
    G (Terminal "") <*> G rhs2 = G rhs2
    G rhs1 <*> G (Terminal "") = G rhs1
    (G rhs1) <*> (G rhs2) = G (Sequence rhs1 rhs2)
	 
Now we get what we want:
*Main> putStrLn $ ppGrammar dottedWordsG
 char , '.',  char , '.' 
Remember our parser for CSV files above? Let me repeat it here, this time using only Applicative combinators, i.e. avoiding (>>=), (>>), return and do-notation:
parseCSVP :: Parser [[String]]
parseCSVP = manyP parseLine
  where
    parseLine = parseCell  sepByP  charG ',' <* charP '\n'
    parseCell = charP '"' *> manyP (anyCharButP '"') <* charP '"'
And now we try to rewrite the code to produce Grammar instead of Parser. This is straightforward with the exception of anyCharButP. The parser code for that inherently monadic, and we just do not have a monad instance. So we work around the issue by making that a primitive grammar, i.e. introducing a non-terminal in the EBNF without a production rule pretty much like we did for anyCharG:
primitiveG :: String -> Grammar a
primitiveG s = G (NonTerminal s)
parseCSVG :: Grammar [[String]]
parseCSVG = manyG parseLine
  where
    parseLine = parseCell  sepByG  charG ',' <* charG '\n'
    parseCell = charG '"' *> manyG (primitiveG "not-quote") <* charG '"'
Of course the names parse are not quite right any more, but let us just leave that for now. Here is the result:
*Main> putStrLn $ ppGrammar parseCSVG
 ('"',  not-quote , '"',  ',', '"',  not-quote , '"'    ''), '
' 
The line break is weird. We do not really want newlines in the grammar. So let us make that primitive as well, and replace charG '\n' with newlineG:
newlineG :: Grammar ()
newlineG = primitiveG "newline"
Now we get
*Main> putStrLn $ ppGrammar parseCSVG
 ('"',  not-quote , '"',  ',', '"',  not-quote , '"'    ''), newline 
which is nice and correct, but still not quite the easily readable EBNF that we saw further up.

Code to produce EBNF, with productions We currently let our grammars produce only the right-hand side of one EBNF production, but really, we want to produce a RHS that may refer to other productions. So let us change the type accordingly:
newtype Grammar a = G (BNF, RHS)
runGrammer :: String -> Grammar a -> BNF
runGrammer main (G (prods, rhs)) = prods ++ [(main, rhs)]
ppGrammar :: String -> Grammar a -> String
ppGrammar main g = ppBNF $ runGrammer main g
Now we have to adjust all our primitive combinators (but not the derived ones!):
charG :: Char -> Grammar ()
charG c = G ([], Terminal [c])
anyCharG :: Grammar Char
anyCharG = G ([], NonTerminal "char")
manyG :: Grammar a -> Grammar [a]
manyG (G (prods, rhs)) = G (prods, Repetition rhs)
mergeProds :: [Production] -> [Production] -> [Production]
mergeProds prods1 prods2 = nub $ prods1 ++ prods2
orElseG :: Grammar a -> Grammar a -> Grammar a
orElseG (G (prods1, rhs1)) (G (prods2, rhs2))
    = G (mergeProds prods1 prods2, Choice rhs1 rhs2)
instance Functor Grammar where
    fmap _ (G bnf) = G bnf
instance Applicative Grammar where
    pure x = G ([], Terminal "")
    G (prods1, Terminal "") <*> G (prods2, rhs2)
        = G (mergeProds prods1 prods2, rhs2)
    G (prods1, rhs1) <*> G (prods2, Terminal "")
        = G (mergeProds prods1 prods2, rhs1)
    G (prods1, rhs1) <*> G (prods2, rhs2)
        = G (mergeProds prods1 prods2, Sequence rhs1 rhs2)
primitiveG :: String -> Grammar a
primitiveG s = G (NonTerminal s)
The use of nub when combining productions removes duplicates that might be used in different parts of the grammar. Not efficient, but good enough for now. Did we gain anything? Not yet:
*Main> putStr $ ppGrammar "csv" (parseCSVG)
csv =  ('"',  not-quote , '"',  ',', '"',  not-quote , '"'    ''), newline ;
But we can now introduce a function that lets us tell the system where to give names to a piece of grammar:
nonTerminal :: String -> Grammar a -> Grammar a
nonTerminal name (G (prods, rhs))
  = G (prods ++ [(name, rhs)], NonTerminal name)
Ample use of this in parseCSVG yields the desired result:
parseCSVG :: Grammar [[String]]
parseCSVG = manyG parseLine
  where
    parseLine = nonTerminal "line" $
        parseCell  sepByG  charG ',' <* newline
    parseCell = nonTerminal "cell" $
        charG '"' *> manyG (primitiveG "not-quote") <* charG '"
*Main> putStr $ ppGrammar "csv" (parseCSVG)
cell = '"',  not-quote , '"';
line = (cell,  ',', cell    ''), newline;
csv =  line ;
This is great!

Unifying parsing and grammar-generating Note how simliar parseCSVG and parseCSVP are! Would it not be great if we could implement that functionality only once, and get both a parser and a grammar description out of it? This way, the two would never be out of sync! And surely this must be possible. The tool to reach for is of course to define a type class that abstracts over the parts where Parser and Grammer differ. So we have to identify all functions that are primitive in one of the two worlds, and turn them into type class methods. This includes char and orElse. It includes many, too: Although manyP is not primitive, manyG is. It also includes nonTerminal, which does not exist in the world of parsers (yet), but we need it for the grammars. The primitiveG function is tricky. We use it in grammars when the code that we might use while parsing is not expressible as a grammar. So the solution is to let it take two arguments: A String, when used as a descriptive non-terminal in a grammar, and a Parser a, used in the parsing code. Finally, the type classes that we except, Applicative (and thus Functor), are added as constraints on our type class:
class Applicative f => Descr f where
    char :: Char -> f ()
    many :: f a -> f [a]
    orElse :: f a -> f a -> f a
    primitive :: String -> Parser a -> f a
    nonTerminal :: String -> f a -> f a
The instances are easily written:
instance Descr Parser where
    char = charP
    many = manyP
    orElse = orElseP
    primitive _ p = p
    nonTerminal _ p = p
instance Descr Grammar where
    char = charG
    many = manyG
    orElse = orElseG
    primitive s _ = primitiveG s
    nonTerminal s g = nonTerminal s g
And we can now take the derived definitions, of which so far we had two copies, and define them once and for all:
many1 :: Descr f => f a -> f [a]
many1 p = pure (:) <*> p <*> many p
anyChar :: Descr f => f Char
anyChar = primitive "char" anyCharP
dottedWords :: Descr f => f [String]
dottedWords = many1 (many anyChar <* char '.')
sepBy :: Descr f => f a -> f () -> f [a]
sepBy p1 p2 = ((:) <$> p1 <*> (many (p2 *> p1)))  orElse  pure []
newline :: Descr f => f ()
newline = primitive "newline" (charP '\n')
And thus we now have our CSV parser/grammar generator:
parseCSV :: Descr f => f [[String]]
parseCSV = many parseLine
  where
    parseLine = nonTerminal "line" $
        parseCell  sepBy  char ',' <* newline
    parseCell = nonTerminal "cell" $
        char '"' *> many (primitive "not-quote" (anyCharButP '"')) <* char '"'
We can now use this definition both to parse and to generate grammars:
*Main> putStr $ ppGrammar2 "csv" (parseCSV)
cell = '"',  not-quote , '"';
line = (cell,  ',', cell    ''), newline;
csv =  line ;
*Main> parse parseCSV "\"ab\",\"cd\"\n\"\",\"de\"\n\n"
Just [["ab","cd"],["","de"],[]]

The INI file parser and grammar As a final exercise, let us transform the INI file parser into a combined thing. Here is the parser (another artifact of last week s homework) again using applicative style2:
parseINIP :: Parser INIFile
parseINIP = many1P parseSection
  where
    parseSection =
        (,) <$  charP '['
            <*> parseIdent
            <*  charP ']'
            <*  charP '\n'
            <*> (catMaybes <$> manyP parseLine)
    parseIdent = many1P letterOrDigitP
    parseLine = parseDecl  orElseP  parseComment  orElseP  parseEmpty
    parseDecl = Just <$> (
        (,) <*> parseIdent
            <*  manyP (charP ' ')
            <*  charP '='
            <*  manyP (charP ' ')
            <*> many1P (anyCharButP '\n')
            <*  charP '\n')
    parseComment =
        Nothing <$ charP '#'
                <* many1P (anyCharButP '\n')
                <* charP '\n'
    parseEmpty = Nothing <$ charP '\n'
Transforming that to a generic description is quite straightforward. We use primitive again to wrap letterOrDigitP:
descrINI :: Descr f => f INIFile
descrINI = many1 parseSection
  where
    parseSection =
        (,) <*  char '['
            <*> parseIdent
            <*  char ']'
            <*  newline
            <*> (catMaybes <$> many parseLine)
    parseIdent = many1 (primitive "alphanum" letterOrDigitP)
    parseLine = parseDecl  orElse  parseComment  orElse  parseEmpty
    parseDecl = Just <$> (
        (,) <*> parseIdent
            <*  many (char ' ')
            <*  char '='
            <*  many (char ' ')
            <*> many1 (primitive "non-newline" (anyCharButP '\n'))
	    <*  newline)
    parseComment =
        Nothing <$ char '#'
                <* many1 (primitive "non-newline" (anyCharButP '\n'))
		<* newline
    parseEmpty = Nothing <$ newline
This yields this not very helpful grammar (abbreviated here):
*Main> putStr $ ppGrammar2 "ini" descrINI
ini = '[', alphanum,  alphanum , ']', newline,  alphanum,  alphanum ,  ' ' 
But with a few uses of nonTerminal, we get something really nice:
descrINI :: Descr f => f INIFile
descrINI = many1 parseSection
  where
    parseSection = nonTerminal "section" $
        (,) <$  char '['
            <*> parseIdent
            <*  char ']'
            <*  newline
            <*> (catMaybes <$> many parseLine)
    parseIdent = nonTerminal "identifier" $
        many1 (primitive "alphanum" letterOrDigitP)
    parseLine = nonTerminal "line" $
        parseDecl  orElse  parseComment  orElse  parseEmpty
    parseDecl = nonTerminal "declaration" $ Just <$> (
        (,) <$> parseIdent
            <*  spaces
            <*  char '='
            <*  spaces
            <*> remainder)
    parseComment = nonTerminal "comment" $
        Nothing <$ char '#' <* remainder
    remainder = nonTerminal "line-remainder" $
        many1 (primitive "non-newline" (anyCharButP '\n')) <* newline
    parseEmpty = Nothing <$ newline
    spaces = nonTerminal "spaces" $ many (char ' ')
*Main> putStr $ ppGrammar "ini" descrINI
identifier = alphanum,  alphanum ;
spaces =  ' ' ;
line-remainder = non-newline,  non-newline , newline;
declaration = identifier, spaces, '=', spaces, line-remainder;
comment = '#', line-remainder;
line = declaration   comment   newline;
section = '[', identifier, ']', newline,  line ;
ini = section,  section ;

Recursion (variant 1) What if we want to write a parser/grammar-generator that is able to generate the following grammar, which describes terms that are additions and multiplications of natural numbers:
const = digit,  digit ;
spaces =  ' '   newline ;
atom = const   '(', spaces, expr, spaces, ')', spaces;
mult = atom,  spaces, '*', spaces, atom , spaces;
plus = mult,  spaces, '+', spaces, mult , spaces;
expr = plus;
The production of expr is recursive (via plus, mult, atom). We have seen above that simply defining a Grammar a recursively does not go well. One solution is to add a new combinator for explicit recursion, which replaces nonTerminal in the method:
class Applicative f => Descr f where
     
    recNonTerminal :: String -> (f a -> f a) -> f a
instance Descr Parser where
     
    recNonTerminal _ p = let r = p r in r
instance Descr Grammar where
     
    recNonTerminal = recNonTerminalG
recNonTerminalG :: String -> (Grammar a -> Grammar a) -> Grammar a
recNonTerminalG name f =
    let G (prods, rhs) = f (G ([], NonTerminal name))
    in G (prods ++ [(name, rhs)], NonTerminal name)
nonTerminal :: Descr f => String -> f a -> f a
nonTerminal name p = recNonTerminal name (const p)
runGrammer :: String -> Grammar a -> BNF
runGrammer main (G (prods, NonTerminal nt))   main == nt = prods
runGrammer main (G (prods, rhs)) = prods ++ [(main, rhs)]
The change in runGrammer avoids adding a pointless expr = expr production to the output. This lets us define a parser/grammar-generator for the arithmetic expressions given above:
data Expr = Plus Expr Expr   Mult Expr Expr   Const Integer
    deriving Show
mkPlus :: Expr -> [Expr] -> Expr
mkPlus = foldl Plus
mkMult :: Expr -> [Expr] -> Expr
mkMult = foldl Mult
parseExpr :: Descr f => f Expr
parseExpr = recNonTerminal "expr" $ \ exp ->
    ePlus exp
ePlus :: Descr f => f Expr -> f Expr
ePlus exp = nonTerminal "plus" $
    mkPlus <$> eMult exp
           <*> many (spaces *> char '+' *> spaces *> eMult exp)
           <*  spaces
eMult :: Descr f => f Expr -> f Expr
eMult exp = nonTerminal "mult" $
    mkPlus <$> eAtom exp
           <*> many (spaces *> char '*' *> spaces *> eAtom exp)
           <*  spaces
eAtom :: Descr f => f Expr -> f Expr
eAtom exp = nonTerminal "atom" $
    aConst  orElse  eParens exp
aConst :: Descr f => f Expr
aConst = nonTerminal "const" $ Const . read <$> many1 digit
eParens :: Descr f => f a -> f a
eParens inner =
    id <$  char '('
       <*  spaces
       <*> inner
       <*  spaces
       <*  char ')'
       <*  spaces
And indeed, this works:
*Main> putStr $ ppGrammar "expr" parseExpr
const = digit,  digit ;
spaces =  ' '   newline ;
atom = const   '(', spaces, expr, spaces, ')', spaces;
mult = atom,  spaces, '*', spaces, atom , spaces;
plus = mult,  spaces, '+', spaces, mult , spaces;
expr = plus;

Recursion (variant 2) Interestingly, there is another solution to this problem, which avoids introducing recNonTerminal and explicitly passing around the recursive call (i.e. the exp in the example). To implement that we have to adjust our Grammar type as follows:
newtype Grammar a = G ([String] -> (BNF, RHS))
The idea is that the list of strings is those non-terminals that we are currently defining. So in nonTerminal, we check if the non-terminal to be introduced is currently in the process of being defined, and then simply ignore the body. This way, the recursion is stopped automatically:
nonTerminalG :: String -> (Grammar a) -> Grammar a
nonTerminalG name (G g) = G $ \seen ->
    if name  elem  seen
    then ([], NonTerminal name)
    else let (prods, rhs) = g (name : seen)
         in (prods ++ [(name, rhs)], NonTerminal name)
After adjusting the other primitives of Grammar (including the Functor and Applicative instances, wich now again have nonTerminal) to type-check again, we observe that this parser/grammar generator for expressions, with genuine recursion, works now:
parseExp :: Descr f => f Expr
parseExp = nonTerminal "expr" $
    ePlus
ePlus :: Descr f => f Expr
ePlus = nonTerminal "plus" $
    mkPlus <$> eMult
           <*> many (spaces *> char '+' *> spaces *> eMult)
           <*  spaces
eMult :: Descr f => f Expr
eMult = nonTerminal "mult" $
    mkPlus <$> eAtom
           <*> many (spaces *> char '*' *> spaces *> eAtom)
           <*  spaces
eAtom :: Descr f => f Expr
eAtom = nonTerminal "atom" $
    aConst  orElse  eParens parseExp
Note that the recursion is only going to work if there is at least one call to nonTerminal somewhere around the recursive calls. We still cannot implement many as naively as above.

Homework If you want to play more with this: The homework is to define a parser/grammar-generator for EBNF itself, as specified in this variant:
identifier = letter,  letter   digit   '-' ;
spaces =  ' '   newline ;
quoted-char = non-quote-or-backslash   '\\', '\\'   '\\', '\'';
terminal = '\'',  quoted-char , '\'', spaces;
non-terminal = identifier, spaces;
option = '[', spaces, rhs, spaces, ']', spaces;
repetition = ' ', spaces, rhs, spaces, ' ', spaces;
group = '(', spaces, rhs, spaces, ')', spaces;
atom = terminal   non-terminal   option   repetition   group;
sequence = atom,  spaces, ',', spaces, atom , spaces;
choice = sequence,  spaces, ' ', spaces, sequence , spaces;
rhs = choice;
production = identifier, spaces, '=', spaces, rhs, ';', spaces;
bnf = production,  production ;
This grammar is set up so that the precedence of , and is correctly implemented: a , b c will parse as (a, b) c. In this syntax for BNF, terminal characters are quoted, i.e. inside ' ', a ' is replaced by \' and a \ is replaced by \\ this is done by the function quote in ppRHS. If you do this, you should able to round-trip with the pretty-printer, i.e. parse back what it wrote:
*Main> let bnf1 = runGrammer "expr" parseExpr
*Main> let bnf2 = runGrammer "expr" parseBNF
*Main> let f = Data.Maybe.fromJust . parse parseBNF. ppBNF
*Main> f bnf1 == bnf1
True
*Main> f bnf2 == bnf2
True
The last line is quite meta: We are using parseBNF as a parser on the pretty-printed grammar produced from interpreting parseBNF as a grammar.

Conclusion We have again seen an example of the excellent support for abstraction in Haskell: Being able to define so very different things such as a parser and a grammar description with the same code is great. Type classes helped us here. Note that it was crucial that our combined parser/grammars are only able to use the methods of Applicative, and not Monad. Applicative is less powerful, so by giving less power to the user of our Descr interface, the other side, i.e. the implementation, can be more powerful. The reason why Applicative is ok, but Monad is not, is that in Applicative, the results do not affect the shape of the computation, whereas in Monad, the whole point of the bind operator (>>=) is that the result of the computation is used to decide the next computation. And while this is perfectly fine for a parser, it just makes no sense for a grammar generator, where there simply are no values around! We have also seen that a phantom type, namely the parameter of Grammar, can be useful, as it lets the type system make sure we do not write nonsense. For example, the type of orElseG ensures that both grammars that are combined here indeed describe something of the same type.

  1. It seems to be the week of applicative-appraising blog posts: Brent has posted a nice piece about enumerations using Applicative yesterday.
  2. I like how in this alignment of <*> and <* the > point out where the arguments are that are being passed to the function on the left.

8 October 2016

Joachim Breitner: T430s T460s

Earlier this week, I finally got my new machine that came with my new position at the University of Pennsylvania: A shiny Thinkpad T460s that now replaces my T430s. (Yes, there is a pattern. It continues with T400 and T41p.) I decided to re-install my Debian system from scratch and copy over only the home directory a bit of purification does not hurt. This blog post contains some random notes that might be useful to someone or alternative where I hope someone can tell me how to fix and improve things.

Installation The installation (using debian-installer from a USB drive) went mostly smooth, including LVM on an encrypted partition. Unfortunately, it did not set up grub correctly for the UEFI system to boot, so I had to jump through some hoops (using the grub on the USB drive to manually boot into the installed system, and installing grub-efi from there) until the system actually came up.

High-resolution display This laptop has a 2560 1440 high resolution display. Modern desktop environments like GNOME supposedly handle that quite nicely, but for reasons explained in an earlier post, I do not use a desktop envrionment but have a minimalistic setup based on Xmonad. I managed to get a decent setup now, by turning lots of manual knobs:
  • For the linux console, setting
    FONTFACE="Terminus"
    FONTSIZE="12x24"
    in /etc/default/console-setup yielded good results.
  • For the few GTK-2 applications that I am still running, I set
    gtk-font-name="Sans 16"
    in ~/.gtkrc-2.0. Similarly, for GTK-3 I have
    [Settings]
    gtk-font-name = Sans 16
    in ~/.config/gtk-3.0/settings.ini.
  • Programs like gnome-terminal, Evolution and hexchat refer to the System default document font and System default monospace font . I remember that it was possible to configure these in the GNOME control center, but I could not find any way of configuring these using command line tools, so I resorted to manually setting the font for these. With the help from Alexandre Franke I figured out that the magic incarnation here is:
    gsettings set org.gnome.desktop.interface monospace-font-name 'Monospace 16'
    gsettings set org.gnome.desktop.interface document-font-name 'Serif 16'
    gsettings set org.gnome.desktop.interface font-name 'Sans 16'
  • Firefox seemed to have picked up these settings for the UI, so that was good. To make web pages readable, I set layout.css.devPixelsPerPx to 1.5 in about:config.
  • GVim has set guifont=Monospace\ 16 in ~/.vimrc. The toolbar is tiny, but I hardly use it anyways.
  • Setting the font of Xmonad prompts requires the sytax
    , font = "xft:Sans:size=16"
    Speaking about Xmonad prompts: Check out the XMonad.Prompt.Unicode module that I have been using for years and recently submitted upstream.
  • I launch Chromium (or rather the desktop applications that I use that happen to be Chrome apps) with the parameter --force-device-scale-factor=1.5.
  • Libreoffice seems to be best configured by running xrandr --dpi 194 before hand. This seems also to be read by Firefox, doubling the effect of the font size in the gtk settings, which is annoying. Luckily I do not work with Libreoffice often, so for now I ll just set that manually when needed.
I am not quite satisfied. I have the impression that the 16 point size font, e.g. in Evolution, is not really pretty, so I am happy to take suggestions here. I found the ArchWiki page on HiDPI very useful here.

Trackpoint and Touchpad One reason for me to sticking with Thinkpads is their trackpoint, which I use exclusively. In previous models, I disabled the touchpad in the BIOS, but this did not seem to have an effect here, so I added the following section to /etc/X11/xorg.conf.d/30-touchpad.conf
Section "InputClass"
        Identifier "SynPS/2 Synaptics TouchPad"
        MatchProduct "SynPS/2 Synaptics TouchPad"
        Option "ignore" "on"
EndSection
At one point I left out the MatchProduct line, disabling all input in the X server. Had to boot into recovery mode to fix that. Unfortunately, there is something wrong with the trackpoint and the buttons: When I am moving the trackpoint (and maybe if there is actual load on the machine), mouse button press and release events sometimes get lost. This is quite annoying I try to open a folder in Evolution and accidentially move it. I installed the latest Kernel from Debian experimental (4.8.0-rc8), but it did not help. I filed a bug report against libinput although I am not fully sure that that s the culprit. Update: According to Benjamin Tissoires it is a known firmware bug and the appropriate people are working on a work-around. Until then I am advised to keep my palm of the touchpad. Also, I found the trackpoint too slow. I am not sure if it is simply because of the large resolution of the screen, or because some movement events are also swallowed. For now, I simply changed the speed by writing
SUBSYSTEM=="serio", DRIVERS=="psmouse", ATTRS speed ="120"
to /etc/udev/rules.d/10-trackpoint.rules.

Brightness control The system would not automatically react to pressing Fn-F5 and Fn-F6, which are the keys to adjust the brightness. I am unsure about how and by what software component it should be handled, but the solution that I found was to set
Section "Device"
        Identifier  "card0"
        Driver      "intel"
        Option      "Backlight"  "intel_backlight"
        BusID       "PCI:0:2:0"
EndSection
so that the command line tool xbacklight would work, and then use Xmonad keybinds to perform the action, just as I already do for sound control:
    , ((0, xF86XK_Sleep),       spawn "dbus-send --system --print-reply --dest=org.freedesktop.UPower /org/freedesktop/UPower org.freedesktop.UPower.Suspend")
    , ((0, xF86XK_AudioMute), spawn "ponymix toggle")
    , ((0, 0x1008ffb2  - xF86XK_AudioMicMute - ), spawn "ponymix --source toggle")
    , ((0, xF86XK_AudioRaiseVolume), spawn "ponymix increase 5")
    , ((0, xF86XK_AudioLowerVolume), spawn "ponymix decrease 5")
    , ((shiftMask, xF86XK_AudioRaiseVolume), spawn "ponymix increase 5 --max-volume 200")
    , ((shiftMask, xF86XK_AudioLowerVolume), spawn "ponymix decrease 5")
    , ((0, xF86XK_MonBrightnessUp), spawn "xbacklight +10")
    , ((0, xF86XK_MonBrightnessDown), spawn "xbacklight -10")
The T460s does not actually have a sleep button, that line is a reminiscence from my T430s. I suspend the machine by pressing the power button now, thanks to HandlePowerKey=suspend in /etc/systemd/logind.conf.

Profile Weirdness Something strange happend to my environment variables after the move. It is clearly not hardware related, but I simply cannot explain what has changed: All relevant files in /etc look similar enough. I use ~/.profile to extend the PATH and set some other variables. Previously, these settings were in effect in my whole X session, which is started by lightdm with auto-login, followed by xmonad-session. I could find no better way to fix that than stating . ~/.profile early in my ~/.xmonad/xmonad-session-rc. Very strange.

5 September 2016

Joachim Breitner: The new CIS-194

The Haskell minicourse at the University of Pennsylvania, also known as CIS-194, has always had a reach beyond the students of Penn. At least since Brent Yorgey gave the course in 2013, who wrote extensive lecture notes and eventually put the material on Github. This year, it is my turn to give the course. I could not resist making some changes, at least to the first few weeks: Instead of starting with a locally installed compiler, doing execises that revolve mostly around arithmetic and lists, I send my students to CodeWorld, which is a web programming environment created by Chris Smith1. This greatly lowers the initial hurlde of having to set up the local toolchain, and is inclusive towards those who have had little expose to the command line before. Not that I do not expect my students to handle that, but it does not hurt to move that towards later in the course. But more importantly: CodeWorld comes with a nicely designed simple API to create vector graphics, to animate these graphics and even create interactive programs. This means that instead of having to come up with yet another set of exercieses revolving around lists and numbers, I can have the students create Haskell programs that are visual. I believe that this is more motivating and stimulating, and will nudge the students to spend more time programming and thus practicing. In fact, the goal is that in their third homework assignemnt, the students will implement a fully functional, interactive Sokoban game. And all that before talking about the built-in lists or tuples, just with higher order functions and custom datatypes. (Click on the picture above, which is part of the second weeks s homework. You can use the arrow keys to move the figure around and press the escape key to reset the game. Boxes cannot be moved yet -- that will be part of homework 3.) If this sounds interesting to you, and you always wanted to learn Haskell from scratch, feel free to tag along. The lecture notes should be elaborate enough to learn from that, and with the homework problems, you should be able to tell whether you have solved it yourself. Just do not publish your solutions before the due date. Let me know if you have any comments about the course so far. Eventually, I will move to local compilation, use of the interpreter and text-based IO and then start using more of the material of previous iterations of the course, which were held by Richard Eisenberg in 2014 and by Noam Zilberstein in 2015.

  1. Chris has been very helpful in making sure CodeWorld works in a way that suits my course, thanks for that!

30 August 2016

Joachim Breitner: Explicit vertical alignment in Haskell

Chris Done s automatic Haskell formatter hindent is released in a new version, and getting quite a bit of deserved attention. He is polling the Haskell programmers on whether two or four spaces are the right indentation. But that is just cosmetics I am in principle very much in favor of automatic formatting, and I hope that a tool like hindent will eventually be better at formatting code than a human. But it currently is not there yet. Code is literature meant to be read, and good code goes at length to be easily readable, and formatting can carry semantic information. The Haskell syntax was (at least I get that impression) designed to allow the authors to write nicely looking, easy to understand code. One important tool here is vertical alignment of corresponding concepts on different lines. Compare
maze :: Integer -> Integer -> Integer
maze x y
  abs x > 4    abs y > 4  = 0
  abs x == 4   abs y == 4 = 1
  x ==  2    && y <= 0     = 1
  x ==  3    && y <= 0     = 3
  x >= -2    && y == 0     = 4
  otherwise                = 2
with
maze :: Integer -> Integer -> Integer
maze x y
  abs x > 4   abs y > 4 = 0
  abs x == 4   abs y == 4 = 1
  x == 2 && y <= 0 = 1
  x == 3 && y <= 0 = 3
  x >= -2 && y == 0 = 4
  otherwise = 2
The former is a quick to grasp specification, the latter (the output of hindent at the moment) is a desert of numbers and operators. I see two ways forward: What could such ways be? (This post is cross-posted on reddit.) Update (2016-09-05) Shortly after this post, the Haskell formatter brittany gets released, which supports vertial alignment. Yay!

5 July 2016

Joachim Breitner: HaL deadline extended

There is this long-running workshop series Haskell in Leipzig, which is a meeting of all kinds of Haskell-interested folks (beginners, experts, developers, scientists), and for year s instance, HaL 2016, I have the honour of being the program committee chair. The original deadline passed last week, and after looking through the submissions it became clear that although the quality was good, the quantitiy was still lacking. I therefore extended the submission deadline by two weeks, until July 15th. So if you have something worth talking about, please do submit a proposal1 and come to Leipzig!. Why should you submit here? Because it gives you a platform to talk about your ideas to an audience that usually does not consist just of your fellow speakers, as it is often with purely academic workshops, but real listeners of various kinds. And it is a fun event. And why should you come to Leipzig? Because of all the interesting talks and tutorials! Of course I cannot say a lot here yet, besides that our invited speaker Alejandro Russo from Chalmers and Gothenburg University will present his work on information-flow control in Haskell (i.e., SecLib, LIO, MAC, HLIO).

  1. And if you want to save me from sleepless nights, submit the first version a few days before the deadline

1 July 2016

Joachim Breitner: When to reroll a six

This is a story about counter-intuitive probabilities and how a small bit of doubt turned out to be very justified. It begins with the game To Court the King (German: Um Krone und Kragen ). It is a nice game with dice and cards, where you start with a few dice, and use your dice rolls to buy additional cards, which give you extra dice or special powers to modify the dice that you rolled. You can actually roll your dice many times, but every time, you have to set aside at least one die, which you can no longer change or reroll, until eventually all dice have been set aside. A few years ago, I have played this game a lot, both online (on yucata.de) as well as in real life. It soon became apparent that it is almost always better to go for the cards that give you an extra die, instead of those that let you modify the dice. Intuitively, this is because every additional die allows you to re-roll your dice once more. I concluded that if I have a certain number of dice (say, n), and I want to have a sum as high as possible at the end, then it may make sense to reroll as many dice as possible, setting aside only those showing a 6 (because that is the best you can get) or, if there is no dice showing a 6, then a single die with the best score. Besides for small number of dice (2 or 3), where even a 4 or 5 is worth keeping, this seemed to be a simple, obvious and correct strategy to maximize the expected outcome of this simplified game. It is definitely simple and obvious. But some doubt that it was correct remained. Having one more die still in the game (i.e. not set aside) definitely improves your expected score, because you can reroll the dice more often. How large is this advantage? What if it ever exceeds 6 then it would make sense to reroll a 6. The thought was strange, but I could not dismiss it. So I did what one does these days if one has a question: I posed it on the mathematics site of StackExchange. That was January 2015, and nothing happened. I tried to answer it myself a month later, or at least work towards at an answer, and did that by brute force. Using a library for probabilistic calculations for Haskell I could write some code that simply calculated the various expected values of n dice for up to n = 9 (beyond that, my unoptimized code would take too long):
1:  3.50000 (+3.50000)
2:  8.23611 (+4.73611)
3: 13.42490 (+5.18879)
4: 18.84364 (+5.41874)
5: 24.43605 (+5.59241)
6: 30.15198 (+5.71592)
7: 35.95216 (+5.80018)
8: 41.80969 (+5.85753)
9: 47.70676 (+5.89707)
Note that this calculation, although printed as floating point numbers, is performed using fractions of unbounded integers, so there are no rounding issues that could skew the result. The result supported the hypothesis that there is no point in rolling a 6 again: The value of an additional die grows and approaches 6 from beyond, but judging from these number is never going to reach it. Then again nothing happened. Until 14 month later, when some Byron Schmuland came along, found this an interesting puzzle, and set out a 500 point bounty to whoever solved this problem. This attracted a bit attention, and a few not very successful attempts at solving this. Eventually it reached twitter, where Roman Cheplyaka linked to it. Coincidally a day later some joriki came along, and he had a very good idea: Why not make our life easier and think about dice with less sides, and look at 3 instead of 6. This way, and using a more efficient implementation (but still properly using rationals), he could do a similar calculation for up to 50 dice. And it was very lucky that he went to 50, and not just 25, because up to 27, the results were very much as expected, approaching value of +3 from below. But then it surpassed +3 and became +3.000000008463403. In other words: If you have roll 28 dice, and you have exactly two dice showing a 3, then it gives you better expected score if you set aside only one 3, and not both of them. The advantage is minuscule, but that does not matter it is there. From then on, the results behaved strangely. Between 28 and 34, the additional value was larger than 3. Then, from 35 on again lower than 2. It oscillated. Something similar could be observed when the game is played with coins. Eventually, joriki improved his code and applied enough tricks so that he could solve it also for the 6-sided die: The difference of the expected value of 198 dice and having 199 dice is larger than 6 (by 10 21...)! The optimizations that allowed him to calculate these numbers in a reasonable amount of time unfortunately was to assume that my original hypothesis (never rerolling a 6 is optimal), which held until n < 199. But this meant that for n > 199, the code did not yield correct results. What is the rationale of the story? Don t trust common sense when it comes to statistics; don t judge a sequence just from a few initial numbers; if you have an interesting question, post it online and wait for 16 months.

9 May 2016

Joachim Breitner: Doctoral Thesis Published

I have officially published my doctoral thesis Lazy Evaluation: From natural semantics to a machine-checked compiler transformation (DOI: 10.5445/IR/1000054251). The abstract of the 226 page long document that earned me a summa cum laude reads
In order to solve a long-standing problem with list fusion, a new compiler transformation, 'Call Arity' is developed and implemented in the Haskell compiler GHC. It is formally proven to not degrade program performance; the proof is machine-checked using the interactive theorem prover Isabelle. To that end, a formalization of Launchbury s Natural Semantics for Lazy Evaluation is modelled in Isabelle, including a correctness and adequacy proof.
and I assembled all relevant artefacts (the thesis itself, its LaTeX-sources, the Isabelle theories, various patches against GHC, raw benchmark results, errata etc.) at http://www.joachim-breitner.de/thesis/. Other, less retrospective news: My paper on the Incredible Proof Machine got accepted at ITP in Nancy, and I was invited to give a keynote demo about the proof machine at LFMTP in Porto. Exciting!

17 February 2016

Joachim Breitner: Free Paradoxes

Maybe it works if he is a freelancer?

Maybe it works if he is a freelancer?

Next.

Previous.