I just recently came across this paper (with the same title as this blog post), and just had to post it and say “wow”. The idea of the paper is pretty simple, actually, for Haskell at least- see, the advantage of purely functional code is that it doesn’t matter when it actually gets executed. Haskell’s lazy evaluation enforces this (the “hair shirt” that SPJ talks about). Purely functional code can run at any time- including compilation time. So why not? Hoist as much code as is resonable and precompute it at compile time. Everything that isn’t directly dependent upon I/O is a possible target. Which means that large chunks, even a majority, of Haskell programs are potiential targets for this optimization.
Now this work is obviously in the early stages, and is yet to be proven out on real world code. And there are a lot of open questions left- like what are good heuristics for when it’s profitable to apply this optimization? But it’s an example of the sorts of large scale programmatic rewrites which are possible with Haskell, but difficult bordering on impossible even with Ocaml, let alone C++ or Java. There’s a whole realm of compiler optimizations out there which hasn’t been explored- because with normal languages the optimizations are just too hard to implement, and so likely to not be applicable in the majority of cases, diluting the effectiveness of the optimization. But the structure of Haskell makes it easy to apply to large parts of the code (“is it using the I/O monad? No? Good, we can precompute it”), and easy to implement.
8 Comments
How is this different in nature than any other kind of optimization? It’s certainly different in scope due to language-based limits of knowledge, but ultimately most optimizations boil down to either 1) shuffling instructions around to pander to the hardware (which this isn’t), or 2) pre-computing values and trimming the cruft off (which this is).
(caveat: I’m neither a Haskell expert or a compiler writer)
It does just look like “pre-computing and trimming”, but the scope of what can be pre-computed is larger because side effects are quarantined. In the slides, range, sum and main are optimized to:
main n = main2 i n
main2 i n = if i > n then 0 else i + main2 (i+1) n
(though I think that first line should be
main n = main2 0 n)Sum and range aren’t called at all, instead they’re replaced with their semantic equivalent… sort of a super inlining. How would this be done cleanly in the presence of side-effects?
Sure, it’s got a lot wider scope (I thought I said that?) because of the awesomeness of functional programming, but it’s still just the same style of optimization as done before. I threw my comment out there because I genuinely wanted clarification: usually, when Brian is deeply taken with something, it’s usually a new idea that hits my brain like a pan-galactic gargle blaster.
In all honesty, I’m surprised this wasn’t being done before. :-D
Ah, I see. I guess the scope was interesting enough for me, as to anything deeper, I’ll leave that to Brian.
It has been done before. It was done by Turchin the in late 60s for Refal, and he called it supercompilation. What Neil is doing is positive supercompilation if I remember right.
As augustss says, it has been done before, but its a while since it’s been applied to a full scale functional language. It is indeed positive supercompilation. The original paper that is on my website (http://www-users.cs.york.ac.uk/~ndm/supero/) follows the ideas of supercompilation relatively closely. The new paper which I’ve just submitted makes a few more changes from the traditional supercompilation approach, which are needed in a larger programming language.
It is quite different from C program optimisation and even optimisation in a GHC/Clean style manner. Those make small incremental changes to the program, moving from one program to another very similar but slightly faster program – then these changes are iterated a lot. Supercompilation starts at one program, and ends at another program, without any kind of step-by-step process.
It’s nice to see someone is interested :)
By saying it has been done before I didn’t mean to belittle Neil’s work. The main idea might be old, but the devil is in the detail. Doing it efficiently and well for a real language is not easy, and I hope Neil will succeed.
I wasn’t going to post up my new draft paper, but following all the interest, I think it makes sense if everyone can see the latest and greatest supero stuff. That paper does cover more of the details – the original paper was written with a more limited knowledge of existing supercompilation, the revised paper builds on existing work so is more powerful.
2 Trackbacks
[...] CandidateDevelopment Acceleration: The Second Derivative of FunctionalityCertification? Please, no.Supero: Making Haskell FasterCompassionate [...]
[...] Supero: Making Haskell Faster [...]