Edit: Once you get the gist of this rant, jump to the comments for a slightly more reasoned approach. Or my follow-up post which attempts to re-open the dialog.
So, I’m trying to do a little bit of XML reading/writing. Nothing major: read in an XML document, grab out some values, and then store the raw XML into the database. I’m doing pretty much the same thing in Groovy, and the XmlSlurper made that blissfully easy.
Since the core library comes with the REXML parser, I figured that it was a nice, stable library, and I’d roll with it. The interface wasn’t as nice as XmlSlurper, but it seemed like it would do.
This was the start of the pain.
In fact, the pain pissed me off enough to share my frustrations with the world. Hopefully someone finds this useful, and they can avoid the pain and suffering I put up with. And yes, I could spend the time I’m griping going through and fixing up all the bugs, but I shouldn’t have to for a language as mature as Ruby. Core libraries are supposed to be stable, reliable beasties. If I wanted to spend all my time debugging half-baked implementations or rolling my own solutions, I’d never leave OCaml. I come to Ruby for the community support. That’s supposed to be the big advantage.
Anyway, here we go:
Problem #1 came along when I tried to parse XML. First of all, the API documentation completely sucks: if you look at the top-level REXML package, it’s totally worthless. If you manage to figure out that it’s REXML::Document that you probably want, you’re still not much better off. If you check out #new, which is really what you probably want, you’re rewarded with one word: “Constructor”. You also get some “@param” tags that ran together and tell you things like the second argument, called “context”, should be a Hash of the context. That clears up a lot! And, seriously, if you’re telling me in the documentation that it should be a Hash, why aren’t we just doing implied static typing and being done with it?
Anyway, I retreated to Google, found the REXML tutorial, and managed to figure it out from there.
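For anyone who lands here with the same question, this is roughly what the basic read path looks like. It is a minimal sketch, not gospel: the sample XML and its element names (order, customer, total) are made up for illustration, and it only uses the one-argument form of REXML::Document.new, leaving the “context” Hash out entirely.

```ruby
require "rexml/document"

# Hypothetical sample document; the element and attribute names are
# invented for this example.
xml = '<order id="42"><customer>Ada</customer><total currency="USD">19.99</total></order>'

# Parse the string into a document tree.
doc = REXML::Document.new(xml)

# Attributes are looked up by name on an element.
order_id = doc.root.attributes["id"]                          # "42"

# Elements#[] takes an XPath-style path relative to the element.
customer = doc.root.elements["customer"].text                 # "Ada"
currency = doc.root.elements["total"].attributes["currency"]  # "USD"

puts "#{order_id}: #{customer} (#{currency})"
```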
But then I kept having this annoying bug: when I called Element#text(), it was not only ignoring my instructions to leave entities alone (i.e. don’t turn “&lt;” into “<”), but it then seemed to go through and attempt to re-parse it, because it was complaining about unbalanced tags! Principle of Least Surprise my ass(1)! I’m not sure why the second part of that was happening, but the first part is apparently documented, so I stopped using the easy-to-read convenience method and went to Element#write.
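To make the entity behavior concrete, here is a small sketch of the difference between the unescaped and escaped views of a text node. It assumes a reasonably current REXML, where Element#get_text returns the REXML::Text node and Text#to_s keeps the entities; the sample XML is made up, and the version I was fighting with may not have behaved this way.

```ruby
require "rexml/document"

# Hypothetical sample: a text node that contains escaped entities.
doc = REXML::Document.new("<note>a &lt; b &amp; c</note>")

# Element#text is the convenience method: entities come back unescaped.
unescaped = doc.root.text           # "a < b & c"

# Element#get_text returns the REXML::Text node itself, and Text#to_s
# gives the escaped form, which is still valid XML text.
escaped = doc.root.get_text.to_s    # "a &lt; b &amp; c"

puts unescaped
puts escaped
```

If the goal is to store XML that can be parsed again later, the escaped form is the one to keep.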
This is where the real pain began. See, Element#write is broken. Deprecated and broken, actually. But the tutorial still tells you to use it. The solution is to use their Formatter approach. Except (ready for it?) that’s broken, too! No, I’m not kidding. In this language’s core library, both versions are broken! The solution is for me to reach in and make a change to the core library so that we avoid a null. In the standard Ruby deployment, using the standard core XML processing library, there is no way to write out XML. It is impossible because of bugs in the library.
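For reference, this is the shape of the Formatter approach as I understand it. Treat it as a sketch under assumptions: it uses REXML::Formatters::Default with a made-up sample document, and whether it actually runs cleanly depends on the REXML version you have installed, which is the whole point of this rant.

```ruby
require "rexml/document"
require "rexml/formatters/default"

# Hypothetical sample document for illustration.
doc = REXML::Document.new("<config><name>demo</name></config>")

# The formatter appends to anything that responds to <<, such as a
# String or an IO. String.new gives us a mutable buffer.
output = String.new
REXML::Formatters::Default.new.write(doc, output)

puts output
```

The Default formatter is meant to write the tree back out without adding indentation, which is what you want if the XML is headed for a database column rather than a human.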
The worst part?
THAT STUPID BUG IN THEIR CORE LIBRARY WOULD HAVE BEEN FIXED WITH STATIC TYPING. Even more if you have a type system which can check nulls for you. Null pointer/“nil when you didn’t expect it!” errors are totally solvable problems. The fact that our industry hasn’t moved past this painful left-over from C is driving me crazy. The next person who tries to tell me that dynamic typing is the best thing since sliced bread is going to get an earful. It is a flat-out wrong position, and I’m done hearing otherwise from anyone.
(1) As much as I’d love to claim that quote, it actually comes from Paul Cantrell’s excellent exploration of closures in Ruby.