Surfing on Chia’s Functional Longing article, I wanted to post an experience I had recently, working on some C# code. The point of this blog entry is that it’s not what a programming language makes possible, it’s what a programming language makes easy, that determines which patterns are common and which aren’t.
What the code was doing was making a bunch of COM calls to another program, which wasn’t ours- specifically, it was a plugin to Excel. Thus I was coding this in C#, because of the three possible languages, VB.NET, C#, and C++, I disliked C# the least. Now, the problem is that this particular plugin was not a model of robust, well-designed code. Actually, it’s pretty close to being a poster child for bad code, and the developers are obviously well into the active/stupid quadrant (among other things, they have reimplemented SQL, badly…). In any case, talking COM to this combination has a bad habit of throwing COMExceptions.
Now, sometimes, when it throws a COMException, what it really means is “I’m busy right now- try again later”, and backing off for a while and trying again will allow the call to succeed. Sometimes, what it really means is “I’m totally screwed- kill Excel, run this other program to clean up the debris, and start the whole process over again.” Unfortunately, there’s no easy way to tell which is which- except that the latter is a permanent state.
So I had this whole block of code which caught the thrown COMException exceptions, and repeatedly backed off and tried again, until it gave up. And this block of code was cut & pasted everywhere I needed to make a COM call. There is a reason this style of programming is considered bad, as we discovered in the Y2K fiasco. The problem is that when you need to fix the cut and pasted code, you need to track down everywhere it was cut and pasted, and fix it everywhere. Which was the problem with Y2K- not fixing the problem, but finding all the damned places that needed to be fixed. The inevitable happened, of course- the inevitable has a distressing way of always happening. I needed to fix all my cut and pasted code. While fixing the problem, I also wanted to refactor the code so that the next time I had to fix this problem, I’d only have to fix it in one place.
And ran smack dab into a functional/object oriented impedance mismatch.
See, there’s a pattern I’ve gotten used to in Ocaml, and have even used in Perl, which I think of as the “hole in the middle” pattern. The basic idea is that you have two pieces of code that are almost exactly identical, except for that little middle part there. You factor out the common code into a single function, which takes a function pointer as an argument. The middle part in the shared code is replaced by a call to the function pointer, and the two places which are being combined simply call the combined function with a pointer to a function that contains the unique part.
In my COMException example, the code I wanted to write would have looked something like this in Ocaml:
let run_com_command f =
  try
    f ()
  with
  | COMException e ->
    (* fancy backoff/restart logic here *)
Most of the places where I was doing COM commands were, excepting the infrastructure of backing off and/or restarting, one liners. Unfortunately, as we shall see, they were one-liners that used and set local variables in non-static context. For Ocaml, no problem. Just use fun:
let local_var = some value in
let set_value = run_com_command (fun () -> my com command) in
…
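For the curious, here is one way the pieces above might be filled in so they actually compile. This is only a sketch under assumptions: COMException is declared as an ordinary Ocaml exception standing in for the .NET one, and max_tries, sleep, the doubling delay, and the "Sheet1" call site are all made up for illustration, not taken from the real code.

```ocaml
(* Stand-in for the .NET COMException; an assumption for this sketch. *)
exception COMException of string

(* Shared code: retry [f] up to [max_tries] times, doubling the delay after
   each failure. [sleep] is a hypothetical hook that does nothing by default;
   a real program would pass in an actual sleep function. *)
let run_com_command ?(max_tries = 5) ?(sleep = fun (_ : float) -> ()) f =
  let rec attempt n delay =
    try f ()
    with COMException msg ->
      if n >= max_tries then raise (COMException msg)  (* give up *)
      else begin
        sleep delay;
        attempt (n + 1) (delay *. 2.0)
      end
  in
  attempt 1 0.1

(* A made-up call site: the "COM call" fails twice with "busy", then works.
   The closure reads one local ([sheet]) and writes another ([failures_left]). *)
let demo () =
  let failures_left = ref 2 in
  let sheet = "Sheet1" in
  run_com_command (fun () ->
    if !failures_left > 0 then begin
      decr failures_left;
      raise (COMException "busy")
    end;
    "value from " ^ sheet)
```

The call site stays close to a one-liner: everything about retrying lives in run_com_command, and the closure touches the locals directly.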
Because it’s so easy, and keeps things simple, Ocaml encourages the use of this pattern. There is very little overhead to factoring out code in this way, so even if the commonly factored code is small, it’s still useful. Standard Ocaml functions like map, fold, and iter can be seen as examples of this hole in the middle pattern. You’re running down a list, doing something to every element of the list- all code doing that is exactly the same, except for that middle part- what exactly it is you’re doing with every element. It’s not like the Ocaml code for running down a list is all that complicated- it’s about three lines of code, not counting the middle bit. But as we’re replacing three lines of code with a single line, it’s still worth it, even in marginal cases like this. And note that the my com command code above can simply access the local_var variable without doing anything special (Ocaml simply turns it into a hidden extra parameter to the created function, and then partially applies it).
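To make the "three lines plus a middle bit" claim concrete, here is a sketch of hand-rolled versions of the list functions. The my_ prefixes are just names for this illustration (the standard library versions are List.map, List.iter, and List.fold_left), and scaled_total is a made-up caller.

```ocaml
(* The shared "run down a list" code; [f] is the hole in the middle. *)
let rec my_map f = function
  | [] -> []
  | x :: rest -> let y = f x in y :: my_map f rest

let rec my_iter f = function
  | [] -> ()
  | x :: rest -> f x; my_iter f rest

let rec my_fold f acc = function
  | [] -> acc
  | x :: rest -> my_fold f (f acc x) rest

(* The middle bit can use a plain local variable ([scale]) with no ceremony. *)
let scaled_total () =
  let scale = 10 in
  let scaled = my_map (fun x -> x * scale) [1; 2; 3] in
  my_fold (fun acc x -> acc + x) 0 scaled
```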
And the number of “middle bits” is often more than one. These pieces of code are exactly identical except for this bit here, and that bit over there. OK, just pass in two functions, one for each bit. Or three functions, if there are three places of difference. Surprisingly dissimilar code can be factored into a common base and three replaceable bits. For example, one tree implementation I wrote had insert, delete, and find all sharing the same tree walking code.
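As an illustration of several operations sharing one walk, here is a small sketch of an unbalanced binary search tree. It is not the tree implementation mentioned above, just a toy under my own assumptions: walk has two holes (on_missing and on_found), each returning a replacement subtree plus a result, and join is a deliberately naive way to glue two subtrees together on delete.

```ocaml
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Shared walk: descend to where [x] is (or would be), then rebuild the path.
   The two holes: [on_missing] runs at an empty spot, [on_found] at a match.
   Each returns the replacement subtree plus a result for the caller. *)
let rec walk ~on_missing ~on_found x = function
  | Leaf -> on_missing ()
  | Node (l, v, r) as node ->
    if x < v then
      let l', res = walk ~on_missing ~on_found x l in
      (Node (l', v, r), res)
    else if x > v then
      let r', res = walk ~on_missing ~on_found x r in
      (Node (l, v, r'), res)
    else on_found node

let find x t =
  snd (walk ~on_missing:(fun () -> (Leaf, false))
            ~on_found:(fun node -> (node, true)) x t)

let insert x t =
  fst (walk ~on_missing:(fun () -> (Node (Leaf, x, Leaf), ()))
            ~on_found:(fun node -> (node, ())) x t)

(* Join two subtrees where everything in [l] is smaller than everything in [r]. *)
let rec join l r =
  match l with
  | Leaf -> r
  | Node (ll, v, lr) -> Node (ll, v, join lr r)

let delete x t =
  fst (walk ~on_missing:(fun () -> (Leaf, ()))
            ~on_found:(function
              | Node (l, _, r) -> (join l r, ())
              | Leaf -> (Leaf, ()))
            x t)
```

Each operation is a couple of lines of "middle bits"; the comparison-and-descend logic exists exactly once.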
Unfortunately for me, I wasn’t programming in Ocaml, I was programming in C#.
The fact that I needed to call this common code from non-static environments and access local variables meant that I couldn’t use C# delegates. Now, it’s possible (probable) that I’m using an older version of C#, and that delegate usefulness has improved. But believe me, I tried delegates. I spent the better part of a day trying to get delegates to work, including emailing Chia for help- he in turn bounced me to the guy he asks C# questions of, who in turn bounced me to his guru, who bounced me to his guru. None of them could help me out (although to be honest, I’m not sure how many really tried).
Which meant that I was forced back to the old tried and true method- the “doit” object. This is a pattern that I see as very common in classic OO programming circles- the class with a single interesting function, the “doit” function. Generally the “doit” function has a slightly better name, like MouseClickEventAction or some such, but that’s what it amounts to. Then, rather than just passing the function around, you now pass an object of the Doit class around instead.
Note that a doit class does have one advantage over an unadorned function reference, such as in Perl or C/C++, in that it allows you to pass state around with the function. The general pattern I’ve fallen into with both Perl and C is to pass a context variable around with function references, with some suitable generic type (void * in C) to pass in to the function when I call it, fulfilling the same role as the object does. In Ocaml, this is handled by partial function application or simply directly accessing local state.
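A small sketch of the three ways of getting state to a callback, all written in Ocaml for comparison. Every name here (call_with_context, call, describe, the "Sheet1" and "B2" strings) is invented for the example.

```ocaml
(* C-style: the callback and its context travel as two separate arguments. *)
let call_with_context callback context = callback context "B2"

(* Closure style: the state is already inside the function being passed. *)
let call callback = callback "B2"

let describe sheet cell = sheet ^ "!" ^ cell

let demo_state () =
  let sheet = "Sheet1" in
  let a = call_with_context describe sheet in       (* context passed by hand *)
  let b = call (describe sheet) in                  (* partial application *)
  let c = call (fun cell -> sheet ^ "!" ^ cell) in  (* direct access to a local *)
  (a, b, c)
```

All three produce the same string; the difference is only in how much plumbing the caller has to write.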
The problem with this is code verbosity. Notice that in C#, at the place where I’m implementing the shared code, I have to:
- define the doit class (probably as an interface, definitely as an abstract base class),
- declare that the function takes an argument of the doit class type, and
- call the doit function at the correct point.
Then, at every place where I want to call the shared code, I have to:
- define a class that implements the correct interface,
- define member variables to hold copies of the local variables I need access (read or write) to, so I can store them within the object, and initialize those member variables that I am writing,
- define a constructor for that class that allows me to copy in the relevant local variables I need to read into the member variables,
- define the doit function, the one real line of code, which sets member variables of the objects for those local variables I am writing,
- allocate a new object of the local doit class, copying the read variables in,
- call the common function with the new object just referenced, and
- copy the written variables out of the doit object and into their correct local variables.
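To show the shape of those steps without guessing at the original C#, here is a rough sketch of the same doit pattern written with Ocaml’s object system. It is a comparison piece only: the doit class type, read_cell_doit, read_cell, and the fake "cell" value are all hypothetical, COMException is a stand-in exception, and the retry logic is left as a comment.

```ocaml
exception COMException of string   (* stand-in for the .NET exception *)

(* Shared side: the doit interface. *)
class type doit = object
  method run : unit
end

(* Shared side: accept a doit object and call it in the middle. *)
let run_com_command (d : doit) =
  try d#run
  with COMException _ -> ()   (* backoff/restart logic would go here *)

(* Call site: a class per call, with members mirroring the locals. *)
class read_cell_doit (row_in : int) = object
  val row = row_in                  (* copy of a local that is read *)
  val mutable result = ""           (* copy of a local that is written *)
  method run = result <- "cell " ^ string_of_int row   (* the one real line *)
  method result = result
end

let read_cell row =
  let d = new read_cell_doit row in     (* allocate, copying the reads in *)
  run_com_command (d :> doit);          (* call the shared code *)
  d#result                              (* copy the written value out *)
```

Only the line marked "the one real line" does any work; the rest is scaffolding to carry one int in and one string out.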
That’s 7-10 lines of code every time I want to call this function. It’s still a win in this case, as I’m replacing 50 lines of code with 10. But of those 10 lines of code, all but one are infrastructure- code that simply gets in the way of understanding what is really happening. All the real action is packed up into that one line. This is one of the reasons Ocaml code can be both significantly shorter and significantly more readable- the code that Ocaml tends to abbreviate heavily is the infrastructure and overhead- Ocaml drops the noise code, not the signal code.
This is also why higher-order functions like map, fold, and iter are huge in Ocaml (and similar functional languages like Haskell and SML), but virtually unknown in languages like C# and Java. The Ocaml programmer looks at it as replacing three lines of code with one, and thus a win. The C# programmer looks at it as replacing three lines of code with ten, and thus not a win. It’s possible to do this in C#, it’s just not easy.
This “minor” change in code factorization has a significant conceptual impact, however. The other way to look at map, fold, and iter is as operations on whole data structures. This makes the relational calculus seem somehow more “natural” to functional programmers, as the relational calculus is, at heart, about operations on whole data structures (tables aka lists/arrays aka relations). Object Oriented programmers, conditioned as they are to treating all data structures as simple data stores, and to not operating on whole data structures but instead simply accessing, inserting, and removing elements, quickly run into a Viet Nam-like quagmire. You get a quagmire when you ignore how the locals think and want things to work, and instead insist on imposing your own paradigm. This is what I was trying to say with this blog post: small changes in what a programming language makes easy can lead to large changes in how programmers think about programming, and these changes can have consequences far out of proportion to the size of the change being made.