Dec 12 2007
Supero Followup
I just wanted to post a followup to my original supero post (which comes up first when you google “Supero Haskell”- neat).
First, I wanted to post a link to Neil Mitchell’s blog (the guy behind Supero), who, all egotism aside, really should be the first link google returns (I should be the second). He has some interesting follow up benchmarks to report.
Second, I’d like to explain why I think this is interesting, as a response to Robert’s comment. It’s not that this idea is new in any deep sense. It’s not. What’s new is the extent to which this optimization can be applied- indeed, one the main areas of research on supero is when to stop applying it. All compilers do this, to a greater or lesser extent. Constant folding is an example of this, as is strength reduction, dead code elimination, etc. To be honest, I hadn’t heard of the super-compilation stuff before. But applying it to a “real” language, like C++ or Java or Ruby or Ocaml, strikes me as being exceedingly difficult. To flip the question back- “it’s been done before, years ago- so why isn’t everyone doing it?” Companies like IBM and Sun and Microsoft and Intel have invested man-millenia in their compilers, with literally billions of dollars hanging on who can make the benchmarks go faster, why haven’t they implemented this?
The answer is, I think, because it’s really effin hard, in everything except the trivial cases (or in Haskell) to know when you can safely apply it. When a function doesn’t subtly depend upon global state, or i/o, or something.
Which brings to another thing- the people claiming that “wc can be made 8 times faster with mmap!” Um, maybe, yeah (then again, maybe not). But not without hundreds of lines of code, and completely obscuring what you’re doing. You could probably make it faster yet if you wrote it in hand-tuned assembly language. What’s amazing about this is that the Haskell side is one line of code. Which is simple enough that you could probably guess what it did even if you didn’t know Haskell.
As a geneneral rule, what we need out of a programming language performance-wise, is simply performance that doesn’t suck, that can be acheived easily. Programmer cycles are way more expensive than CPU cycles these days. And there is a performance advantage to the code being so simple it’s obviously correct- what other optimizations are possible with all of those hundreds of lines of code the Haskell programmer didn’t have to write? With a benchmark as simple wc, probably not many. For any sort of more complicated program, the possibilities just open up.
I wrote, in a follow-on comment to my Popularity: 4% [?]
Your mark-up is broke.
This is pretty cool that it’s being applied so widely, but I kinda expected this out of functional languages. So much is known at compile-time that I’d expect huge gains — in fact, I guess I was implicitly thinking that this was happening in OCaml, which is part of the reason it’s so speedy.
No. Ocaml actually isn’t that good at optimization. I mean, it does the simple things- peephole optimization, inlining, a few others- and it has an excellent garbage collector (even for ML languages), and as a language it allows you to write non-stupid code. But compare even to GCC 2.94, it’s doesn’t optimize that much (let alone comparing it to GCC 4). Also, it helps that modern CPUs are surprisingly good at “optimizing” less than optimal code.
The dirty little secret of the “popular” languages are that optimization doesn’t help you much. Something I forgot to mention is Proebstring’s Law: that increases in compiler optimization double program speed every 18 years. That’s compared to the common misquoting of Moore’s law that has CPU speed doubling every 18 months.
And I think Proebstring’s law misrepresents the situation, to the benefit of making the optimization improvements look better that they really are. The justification for Proebstring’s law is that after 36 years of progress, programs are about 4x faster. But the vast bulk of that improvement was the low hanging fruit- a small selection of “simple” optimizations- peephole optimizers, register allocation, strength reduction, inlining- gives an easy 2x speed improvement, and were well known 36 years ago (1971). Which means over the last 36 years, compilers have only doubled the performance of programs- or about a 2% gain per year.
Parenthetically, this is why Ocaml can still be “speedy” and not have complicated, expensive optimizations- the complicated, expensive optimizations simply don’t buy you that much. It does the simple optimizations which net a lot of performance gain for (comparatively) little pain, but doesn’t bother with the optimizations which are lots of pain for little gain.
Then wham- out of left field comes a grad student and an optimization speeding code up by 10% to three fold. That’s like 5 years to 55 years worth of compiler optimization development at current rates, all at one shot. And he’s not done- there are probably (almost certainly) improvements to that to be made.
An interesting set of slides on Proebstring’s law and the future of optimization research is here.
His conclusion there is, I think, fundamentally pessimistic- basically that we’ve hit the point of diminishing returns (at least for popular languages like C++, Fortran, and Java), and that the future of compiler optimizations is going to focus on a small segment of special users- High Performance Computing- and leave everyone else out to dry.
Which is yet another interesting aspect of Supero- he’s not focusing on “high dollar value” customers. I mean, his basic benchmark is word count. That’s about as plebian as it gets.