I’m having a moderately interesting experience. Long story short: I’m looking at rewriting the Java collection classes. Now, one interesting point about Java is that it encourages deep inheritance trees. So I started my design by writing out what “leaf classes” I would have in my inheritance tree (more specifically, what concrete implementations I’d have for the leaves of my interface inheritance tree). This is often a useful, if somewhat waterfallish, approach to class hierarchy design. In any event, I’ve ended up with eight or so major interfaces.
Which now I need names for.
At which point I run head first and at speed into one of the common problems of programming: that there simply aren’t enough words for generic concepts. How often do programmers need different names for “a thing that holds other things”? I mean, the standard Java library already has an array or two, a binding or two, a box, a buffer, a collection, a composite, a container, a dictionary, a document or two, an element or two (or three), an entity or two, an enumeration, an environment, a field, a file, a format, a frame, a group, an iterator, a level, a list or two, a locale, a locator, a manifest, a map, a member, a menu, a package, a panel, a pipe, a port, a reference or two, a registry, a segment, a set, a sequence, a socket, a source, a stack, a state, a string, a subject, a system, a track, a transmitter, a vector, a view, and a window. And that’s just in the standard library.
And it’s not just with “things that hold things”. It’s an occupational hazard in programming to try to generalize things as much as possible. Java also has an action (figures), an adjustable, an any (nouning verbs weirds the language), an attribute or two (the intelligence of this is its strength), or even more attributes, a blob (no Steve McQueen, however), a choice (choose wisely), a class (upper or lower not specified), a control (and, in case you lose it, more control), a destination (and a driver to get you there), an element (only three elements, however- we never get to the fifth), an error (which is invalid), an event, an expression (a frown, apparently), a handler, a method, a number, an object or two, an option (stock), a parameter, a subject, and a transformer (more than meets the eye). We’re seeing a strong tendency towards things having qualities.
It’s an impedance mismatch between the needs of natural language and the needs of computer programming. I mean, in normal conversation, how many different words do you need to mean “a generic thing that can hold other things”? Programming is already grabbing words with much more specific meanings and wiping the detail away. For example, in English, the words list, array, sequence, enumeration, and table all mean very nearly the same thing. You can generally use any one in place of any of the others with little impact on meaning. In programming, they get assigned specific, oftentimes subtle, meanings. For example, “list” generally implies some form of linked list, where prepending to the head of the list will be cheap, but getting an arbitrary element from the middle of the list will be expensive. “Array”, however, tends to make me think that getting arbitrary elements from the middle will be cheap, but prepending an element will be expensive. Use the right one, and the code flies; use the wrong one, and it’s slower than molasses in January. So I start being picky about whether it’s a “list” or an “array”- this makes a difference!
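To make the distinction concrete, here is a minimal sketch, assuming that java.util.LinkedList stands in for “list” and java.util.ArrayList for “array”. The class and method names (ListVersusArray, prependAll, middle) are made up for illustration. Both containers end up holding the same elements; only the cost of getting there differs.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListVersusArray {
    // Prepend 0..n-1 one at a time. Each call is cheap for a LinkedList,
    // while an ArrayList has to shift every existing element over.
    static List<Integer> prependAll(List<Integer> target, int n) {
        for (int i = 0; i < n; i++) {
            target.add(0, i);
        }
        return target;
    }

    // Fetch the middle element. An ArrayList indexes straight to it,
    // while a LinkedList has to walk node by node to get there.
    static int middle(List<Integer> source) {
        return source.get(source.size() / 2);
    }

    public static void main(String[] args) {
        List<Integer> linked = prependAll(new LinkedList<>(), 5);
        List<Integer> array = prependAll(new ArrayList<>(), 5);
        System.out.println(linked.equals(array));                  // true
        System.out.println(middle(linked) + " " + middle(array));  // 2 2
    }
}
```

The two calls look identical at the call site, which is rather the point: the name of the container is the only hint about which operation is going to hurt.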
This reuse bites us on the tush on a regular basis as well. In discussing Java, I once uttered the sentence “Every object of class Object has an object of class Class which represents the class of the object.” Which is true (rather like how, in C++, the definition of a friend is someone who can see your private parts)- but, rather humorously, much easier to parse when written out than when spoken, as the capitalization is a clue as to when I’m using Object or Class as a name, and when I’m using it as a term.
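For the curious, that sentence can be checked in a few lines. This is only a sketch: the class name ClassOfObject and the helper describe are invented for the example, while getClass() and getName() are the standard java.lang methods.

```java
class ClassOfObject {
    // Name the class of an object, and the class of the object
    // that represents that class.
    static String describe(Object o) {
        Class<?> c = o.getClass();     // the object of class Class for o
        Class<?> meta = c.getClass();  // which is itself an object, with a class
        return c.getName() + " / " + meta.getName();
    }

    public static void main(String[] args) {
        // prints "java.lang.Object / java.lang.Class"
        System.out.println(describe(new Object()));
        // The Class object is also an Object: prints "true"
        System.out.println(new Object().getClass() instanceof Object);
    }
}
```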
All of these classes that I’ve been picking on have well defined local meanings. It’s only when viewed in aggregate that the problem becomes apparent. It’s not just Java that has this problem, I hasten to point out- pretty much any programming language of sufficient complexity and breadth of library support runs into the same problem (consider that OCaml uses “map” to mean both an operation and a data structure- and it makes sense to “map a map”). This is a problem of programming. It arises from the natural desire of programmers to generalize as much as possible- which encourages reuse, a good thing. And it’s most apparent when you’re trying to think up more names- different from those already in use, but memorable and spellable- to mean yet more things that have other things. You end up asking yourself questions like “is a list more like an agglomeration, or like a multitude?” It’s only the facade of professionalism that keeps there from being half a dozen classes named “thing”.
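The “map a map” collision is not unique to OCaml. Here is a rough Java analogue, assuming Java 8 or later with streams; the class name MapAMap and the method labels are hypothetical. It applies the map operation to the entries of a map data structure.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MapAMap {
    // Apply the map *operation* to the entries of a map *data structure*.
    static List<String> labels(Map<String, Integer> counts) {
        return counts.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A TreeMap keeps its keys sorted, so the order below is stable.
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("list", 2);
        counts.put("map", 1);
        System.out.println(labels(counts)); // [list=2, map=1]
    }
}
```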
Once we’ve twisted the generic English word to fit our precise computer definition, we often forget where the original word came from. In English, the words “enumeration” and “list” are basically identical. In C/C++, however, an enumeration is a way to assign a list of names specific values (while a list, naturally, is an enumeration of elements). When switching to Java, which has a radically different concept of what an enumeration is (a way to view a list), many C programmers balk and have difficulty with this new meaning of the word. Dammit, that’s not what an enumeration is. We twist the words to have a specific meaning in a given limited context- but the meaning of the word inevitably escapes that context. When switching to OCaml, Java programmers will in turn have problems with the word “map” (dealing with the operation, not the data structure). We can’t seem to leave the specific local meanings of a word behind. In this way we’re the opposite of Humpty Dumpty- a word means what it means no matter what.
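For anyone who has only met the C meaning, this is roughly what the Java sense of the word looks like. It is a sketch using java.util.Enumeration via Collections.enumeration; the class name EnumerationView and the method drain are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class EnumerationView {
    // Walk a list through an Enumeration and collect whatever it hands back.
    static List<String> drain(List<String> names) {
        Enumeration<String> e = Collections.enumeration(names);
        List<String> seen = new ArrayList<>();
        while (e.hasMoreElements()) {
            seen.add(e.nextElement());
        }
        return seen;
    }

    public static void main(String[] args) {
        // No named constants anywhere: the enumeration is just a view
        // that steps through the list once, front to back.
        System.out.println(drain(Arrays.asList("red", "green", "blue")));
        // prints [red, green, blue]
    }
}
```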
I’m not sure what the solution here is- or if there is even a solution beyond grumble, laugh, and thumb through the thesaurus yet again. The advantage of hooking into the natural language knowledge of the programmer (to give easier-to-remember names and meanings) is much greater than the problems it creates. In fact, the way words keep their meaning even outside the locale they were originally defined in speaks to the power of this approach. It taps into something deep in how we, as humans, work with language. The problem still remains: I still need more words that mean “things that hold other things”. Bunch? Bundle? Multitude?