Introduction
One of the things that has consistently been difficult in the whole dynamic typing/static typing conversation is that people don’t seem to understand what a real static typing language can do: here’s a classic example (and someone else who was also annoyed). The dynamic typing vs. static typing conversation seems to be Java’s type system vs. Ruby’s type system, which simply isn’t fair. So, in the spirit of advancing discourse and helping people understand why I enjoy Ocaml so much, let me present…
7 Actually Useful Things You Didn’t Know Static Typing Could Do
Run Without a Distinct Compilation Step
Perhaps the most useful tool in the Ruby coder’s toolbox, the Read-Evaluate-Print Loop (REPL) called “irb” is the sketchbook where your day-to-day code artist tries out new efforts and exercises the language. It’s also the place where you go to start digging through implementation reflection to see what methods you really have, and how those methods actually behave. That method that squares the integer passed in — what happens when you pass in a string? Those kinds of things.
An analogous tool is also provided for Ocaml developers. If you just call the “ocaml” program without an argument, it drops into a REPL — referred to as the “toplevel” — where you can try things out and do all the same kinds of rapid development practices.
The toplevel produces feedback like this:
# let input () = print_string "This is output!\n";;
val input : unit -> unit = <fun>
# input ();;
This is output!
- : unit = ()
(I’m going to be using toplevel printouts later on in the article, so you’re going to want to get used to seeing them: a line starting with # is input, a line starting with “val” or “- :” is REPL feedback (typing information), and anything else is standard output.)
Similarly, an Ocaml source file can be executed directly. When run this way, it still provides type safety, and this is a nice way to check your code out as you’re building it. Personally, I’m not a big fan of Ocaml as a scripting language (Perl still wins for me, even over Ruby), but Brian claims it holds its own: “as gigawatt lasers go, this one works remarkably well for killing flies.”
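To make the "square an integer, then try a string" experiment concrete, here is a minimal sketch of a script that can be run directly. The file name square.ml and the function name square are just assumptions for illustration.

```ocaml
(* square.ml: run it directly with `ocaml square.ml`, no separate build step. *)

(* The type int -> int is inferred; no annotation is needed. *)
let square x = x * x

let () =
  (* Prints 49 followed by a newline. *)
  print_int (square 7);
  print_newline ()

(* Uncommenting the next line gets the phrase rejected with a type error
   instead of being evaluated, because "7" is a string and not an int:
   let _ = square "7" *)
```

Pasting the same definitions into the toplevel gives the same answer, one phrase at a time.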
Make NullPointerExceptions/nil-when-you-didn’t-expect-it a Thing of the Past
That’s right. If you switch to Ocaml, the compiler will guarantee that you never see a $!# NullPointerException (or equivalent) again. Ever. Really.
Null/nil is so obnoxious and pervasive in Java and the dynamic languages that Groovy introduced the “safe navigation operator” (?.) and the “Elvis operator”. Ruby hasn’t come to a final decision yet, but they’re constantly digging around at how this should work (someone want to provide cites?). Perl and Groovy also hijack boolean coercion to make a null object equal to false, which opens the door to the common bug where an int value is 0 and so gets eaten in what was supposed to be a null check. Back in Perl, people discovered that “0e0” evaluated to true and started returning that freak of floating point arithmetic to mean “zero-but-true” to avoid the coercion (example). This, of course, just leads to the new bug of people using “unless $var” to actually check for 0 and having it fail on them (example with hacky code). Irony, thy name is operator overloading.
Ocaml saves us from this Hell by reversing the default behavior that you see in C++/Java/Perl/Ruby: instead of variables being nullable by default, you have to explicitly declare that this variable you’re using is possibly null (called “None”). If you’ve been declaring variables at the point of assignment, as I’ve been advocating ([1], [2], [3]), then you already know that you can usually get rid of null without missing it much.
In those cases when you do miss it in Ocaml, you can use the “'a option” type. Expressions of that type may or may not have a value, and the user code is required to figure out what to do in either case. This is what it looks like:
# let x = Some(1);;
val x : int option = Some 1
# let y = None;;
val y : 'a option = None
# let foo x = match x with Some(i) -> print_int i; print_string "\n" | None -> print_string "None provided!\n";;
val foo : int option -> unit = <fun>
# foo x;;
1
- : unit = ()
# foo y;;
None provided!
- : unit = ()
This pattern makes the API a bit more obnoxious to use, which actually means that possibly null variables are used quite a bit less in Ocaml than in other languages. This is to be construed as a Good Thing: having the compiler guarantee that your objects aren’t null means that you don’t have to spend all your time writing “newvar = var unless var” or other (usually redundant) safety code. And it means that you don’t have to either waste your typing in the documentation explaining what happens when a null pointer is passed in, or leave your code to behave arbitrarily in that case.
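As a rough illustration of how this plays out in everyday code, here is a small sketch of a lookup that may fail. The names lookup, with_default, and stock are made up for this example; the point is only that a genuine 0 and a missing value can never be confused, which is exactly the zero-but-true mess described above.

```ocaml
(* Look up a key in an association list. The result type is 'b option,
   so the "not found" case is visible in the type. *)
let rec lookup key = function
  | [] -> None
  | (k, v) :: rest -> if k = key then Some v else lookup key rest

(* Callers have to say what happens when the value is absent. *)
let with_default default = function
  | Some v -> v
  | None -> default

let stock = [ ("apples", 3); ("pears", 0) ]

let () =
  (* A real zero stays distinct from a missing entry. *)
  assert (lookup "pears" stock = Some 0);
  assert (lookup "kiwis" stock = None);
  print_int (with_default (-1) (lookup "pears" stock));   (* prints 0 *)
  print_newline ();
  print_int (with_default (-1) (lookup "kiwis" stock));   (* prints -1 *)
  print_newline ()
```

Forgetting the None branch in with_default would not slip through silently: the compiler warns that the match is not exhaustive.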
Duck Typing
This is the big feather in the dynamic language enthusiast’s cap: duck typing. Not having to explicitly type every variable and parameter and return value is really nice. It makes your code a lot easier to read by increasing the signal to noise ratio: redundant tokens can be knocked out of the code, and just the core logic is left behind. It also enables that wonderful productivity boost of surprising reuse in more places: you don’t have to be a Java coder for too long before you discover a place you’d really like to pass in a similar object, but can’t because they don’t have the exact same type.
Ocaml allows the same duck typing gains as the dynamic languages without sacrificing the type safety. Objects, including parameters and return values, are structurally and not nominally typed: methods require the parameters to implement “#foo()”, “#bar(int)”, “#baz(string, list)” instead of requiring the parameters to belong to a certain element of the type tree which has “#foo()”, “#bar(int)”, and “#baz(string, list)” as part of its API contract. Even better yet, Ocaml will derive the required structure from the method implementation itself, so the developer doesn’t need to specify anything.
Here’s what it looks like in play. Consider this Ruby code, written by someone who decided to ignore the whole comparison/equality inheritance problem by using duck typing:
def nudge(pt)
  pt.move(1, 1)
end

class TwoDeePoint
  attr_accessor :x, :y

  def initialize
    @x = 1
    @y = 2
  end

  def move(dx=1, dy=1)
    @x += dx
    @y += dy
  end

  def put
    print "(#{@x}, #{@y})"
  end
end

class ThreeDeePoint
  attr_accessor :x, :y, :z

  def initialize
    @x = 3
    @y = 4
    @z = 5
  end

  def move(dx=1, dy=1, dz=1)
    @x += dx
    @y += dy
    @z += dz
  end

  def put
    print "(#{@x}, #{@y}, #{@z})"
  end
end

class Whatever
  def put
    print "Whatever..."
  end
end

def do_put(obj)
  obj.put()
  puts
end

two_d = TwoDeePoint.new
do_put two_d

three_d = ThreeDeePoint.new
do_put three_d

whatev = Whatever.new
do_put whatev

nudge two_d
two_d.put

nudge three_d
three_d.put
puts
Lots of cool duck typing going on there. Here’s the Ocaml version:
let nudge pt = pt#move2 1 1

class two_dee_point = object
  val mutable x = 1
  val mutable y = 2
  method move dx = x <- x + dx
  method move2 dx dy = x <- x + dx; y <- y + dy
  method put = print_string ("(" ^ (string_of_int x) ^ ", " ^ (string_of_int y) ^ ")")
end;;

class three_dee_point = object
  val mutable x = 3
  val mutable y = 4
  val mutable z = 5
  method move dx = x <- x + dx
  method move2 dx dy = x <- x + dx; y <- y + dy
  method move3 dx dy dz = x <- x + dx; y <- y + dy; z <- z + dz
  method put = print_string ("(" ^ (string_of_int x) ^ ", " ^ (string_of_int y) ^ ", " ^ (string_of_int z) ^ ")")
end;;

class whatever = object
  method put = print_string "Whatever..."
end;;

let do_put obj = obj#put; print_string "\n";;

let two_d = new two_dee_point;;
do_put two_d;;

let three_d = new three_dee_point;;
do_put three_d;;

let whatev = new whatever;;
do_put whatev;;

nudge two_d;;
two_d#put;;

nudge three_d;;
three_d#put;;
print_string "\n";;
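To see what the compiler actually infers for a duck-typed function, here is a smaller sketch using immediate objects. The names point, whatever, and describe are assumptions for this example, and put returns a string here (instead of printing) only to keep the result easy to check.

```ocaml
(* Two unrelated objects that both happen to have a put method. *)
let point = object
  val x = 1
  val y = 2
  method put = Printf.sprintf "(%d, %d)" x y
end

let whatever = object
  method put = "Whatever..."
end

(* Inferred type: < put : string; .. > -> string
   The two dots mean: any object with at least a put method. *)
let describe obj = obj#put ^ "\n"

let () =
  print_string (describe point);      (* prints (1, 2) *)
  print_string (describe whatever)    (* prints Whatever... *)

(* Rejected at compile time, since an int list is not an object with put:
   let _ = describe [1; 2; 3] *)
```

Nothing ties point and whatever together except the shape of their methods, and that shape is all describe asks for.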
There’s a lot more conversation about this in the comments below. And here’s an interesting little statistic courtesy of our buddy wc -l:
42 duckTyping.ml
68 duckTyping.rb
DSLs
The whole idea behind DSLs is one that I’ve never really gotten my head around. When they first came out, the promise was that they would relieve developers of all that dreary per-instance configuration by allowing business people to jump in and write code directly. This is something which set off my snake oil alarms, because I had heard it before about Java bean-based GUIs (“draw a line from the output of this bean to that input of that bean: that simple!”) and graphical scheduler front-ends (“create a box at this schedule, and then type in the name of the thing you want to run: that simple!”).
So far, my cynicism seems to be validated: asking around now, it seems like the definition of “DSL” is starting to resemble the definition of “API” — and I’m not the only person who has noticed[1]. This kind of API — where you say what you mean instead of building up whole towers of new DoIt(foo, bar) and new ParamHolder(baz, frodo) structures — is definitely an advancement, and dynamic languages like Ruby and Groovy are right to be proud of the ease of their use.
But while this is an advance, it’s not a novel invention — there is already prior art (see another reinvention here). What the Rubyists call a “DSL”, Ocamlists call “readable code”[2]. Ocaml provides two very powerful tools for writing DSL-esque code simply and easily: variant types and matching.
Consider this implementation of Rails’ Numeric Time extension:
# #load "unix.cma";;
# open Unix;;
# type scale = Hour | Hours | Day | Days | Week | Weeks;;
type scale = Hour | Hours | Day | Days | Week | Weeks
# type direction = Ago | Hence;;
type direction = Ago | Hence
# let date x y z =
    let seconds_per_hour = 60.0 *. 60.0 in
    let scl = match y with
      | Hour -> seconds_per_hour
      | Hours -> seconds_per_hour
      | Day -> 24.0 *. seconds_per_hour
      | Days -> 24.0 *. seconds_per_hour
      | Week -> 24.0 *. 7.0 *. seconds_per_hour
      | Weeks -> 24.0 *. 7.0 *. seconds_per_hour
    in
    let amt = match z with
      | Ago -> -1.0 *. scl *. (float_of_int x)
      | Hence -> scl *. (float_of_int x)
    in
    localtime (gettimeofday() +. amt)
  ;;
val date : int -> scale -> direction -> Unix.tm = <fun>
# let print_datetime t =
    print_int t.tm_mday;
    print_string "/";
    print_int (1 + t.tm_mon);
    print_string "/";
    print_int (1900 + t.tm_year);
    print_string " ";
    print_int t.tm_hour;
    print_string ":";
    print_int t.tm_min;
    print_string ":";
    print_int t.tm_sec;
    print_string "\n"
  ;;
val print_datetime : Unix.tm -> unit = <fun>
# let example1 = date 5 Days Ago;;
val example1 : Unix.tm =
{tm_sec = 11; tm_min = 17; tm_hour = 9; tm_mday = 9; tm_mon = 3;
tm_year = 108; tm_wday = 3; tm_yday = 99; tm_isdst = true}
# let example2 = date 1 Hour Hence;;
val example2 : Unix.tm =
{tm_sec = 34; tm_min = 17; tm_hour = 10; tm_mday = 14; tm_mon = 3;
tm_year = 108; tm_wday = 1; tm_yday = 104; tm_isdst = true}
# print_datetime example1;;
9/4/2008 9:17:11
- : unit = ()
# print_datetime example2;;
14/4/2008 10:17:34
- : unit = ()
I haven’t done anything cute to get around having to put the word “date” in front of the numbers, but that’s more because I like how it reads: “let example one be the date five days ago.”
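The variant-plus-match idea can also be tried without the Unix library. The following is only a sketch: offset is a hypothetical helper (not part of the session above) that returns a signed number of seconds, and the scale type is trimmed to three constructors for brevity.

```ocaml
type scale = Hours | Days | Weeks
type direction = Ago | Hence

(* Hypothetical helper: signed offset in seconds for phrases such as
   offset 5 Days Ago. *)
let offset n scale direction =
  let per_hour = 60 * 60 in
  let unit_seconds = match scale with
    | Hours -> per_hour
    | Days -> 24 * per_hour
    | Weeks -> 7 * 24 * per_hour
  in
  match direction with
  | Ago -> -(n * unit_seconds)
  | Hence -> n * unit_seconds

let () =
  Printf.printf "%d\n" (offset 5 Days Ago);     (* prints -432000 *)
  Printf.printf "%d\n" (offset 1 Hours Hence)   (* prints 3600 *)
```

Two things come for free here: a misspelled word such as Dayz is a compile error (unbound constructor), and adding a new constructor to scale without extending the match produces a non-exhaustive match warning.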
[1]“Have you ever programmed in a language other than Ruby? (PHP and HTML don’t count.) If not, it’s a DSL.” ROFLMAOPIMP!
[2]Not to be confused with literate programming.
Passing Blocks Without Pain
For some reason, there is a sense (sometimes explicit) that static languages can’t handle closures — that closures and functional programming are uniquely the domain of dynamically typed languages, and that statically typed languages are struggling to offer any kind of real support.
Despite that popular untruth and Ruby supporters’ touting of Ruby as “a sort of OO / functional hybrid” (cite), this is one of the big pain points in Ruby, and there’s a lot of experimentation to figure out how to fix it. For why this is a pain in Ruby, check out Paul Cantrell’s excellent “Closures in Ruby”; for approaches to fixing the problem, see the experimental Ruby 1.9 syntax extensions here, here, and here. In short, it all tracks back to Ruby not having first-class functions, and therefore not being a “functional/OO hybrid”, as much as it may want to sell itself as such. It’s aggravated by the fact that blocks and Procs are different things, which causes a lot of pain as demonstrated here:
puts "General function passing demonstration"

def partition(lst, &check)
  yes = lst.select(&check)
  no = lst.select { |item| !(check.call(item)) }
  return [yes, no]
end

lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
puts "List:\t#{lst.join("\t")}"
out = partition(lst) { |item| item % 2 == 0 }
yes = out[0]
no = out[1]
puts "Yes:\t#{yes.join("\t")}"
puts "No:\t#{no.join("\t")}"

puts "Nontrivial closure demonstration"

def mod_closure_maker(mod)
  return Proc.new { |item| item % mod == 0 }
end

# NOTE: out = partition(lst, mod_closure_maker(3)) raises an error
# closures.rb:25:in `partition': wrong number of arguments (2 for 1) (ArgumentError)
def partition2(lst, check)
  partition(lst) { |item| check.call(item) }
end

mod3 = mod_closure_maker(3)
mod4 = mod_closure_maker(4)

out = partition2(lst, mod3)
yes = out[0]
no = out[1]
puts "--0 Mod 3--"
puts "Yes:\t#{yes.join("\t")}"
puts "No:\t#{no.join("\t")}"

out = partition2(lst, mod4)
yes = out[0]
no = out[1]
puts "--0 Mod 4--"
puts "Yes:\t#{yes.join("\t")}"
puts "No:\t#{no.join("\t")}"

# NOTE: out = partition2(lst) { |item| item % 5 == 0 } raises an error
# closures.rb:46:in `partition2': wrong number of arguments (1 for 2) (ArgumentError)
# This is the way to handle that partition
def partition3(lst, check_proc=nil, &check_block)
  if(check_proc)
    partition2(lst, check_proc)
  else
    # The & converts the captured block back into a block argument;
    # without it, this call hits the same ArgumentError as above.
    partition(lst, &check_block)
  end
end
So, by way of some optional-argument hackery, we can get to the point where we can either pass a block or a Proc. Or both, really, but the block will be ignored if the Proc is provided. And if you want to pass two blocks to a method in Ruby, you’re SOL.
Ocaml, being a real functional language, handles this with grace.
let puts x = print_string x; print_string "\n";;

let rec puts_just_list lst =
  match lst with
  | [] -> ()
  | x :: xs -> print_int x; print_string "\t"; puts_just_list xs
;;

let puts_list title lst =
  print_string title; print_string "\t"; puts_just_list lst; print_string "\n"
;;

puts "General function passing demonstration\n";;

let partition lst check = List.partition check lst;;
let lst = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10];;
puts_list "List:" lst;;

let out = partition lst (function item -> item mod 2 = 0) in
let yes = fst out in
let no = snd out in
puts_list "Yes:" yes;
puts_list "No:" no
;;

puts "Nontrivial closure demonstration";;

let mod_closure_maker x = function y -> y mod x = 0;;
let mod3 = mod_closure_maker 3;;
let mod4 = mod_closure_maker 4;;

let out = partition lst mod3 in
let yes = fst out in
let no = snd out in
puts "--0 mod 3--";
puts_list "Yes:" yes;
puts_list "No:" no
;;

let out = partition lst mod4 in
let yes = fst out in
let no = snd out in
puts "--0 mod 4--";
puts_list "Yes:" yes;
puts_list "No:" no
;;
Succinct Data Structure Syntax
Dating back to Perl, one of the major advantages that mainstream dynamic languages brought to the table over C++/Java is the succinct syntax for data structures. Data structures make up the backbone of data organization, and so the fact that they’re second-class syntactical denizens is really a shame. Despite having learned Lisp first (for Emacs hacking), my experience was that it was Perl’s succinct syntax which really revealed the power of having lists of maps of maps of lists of objects. The answer from the Object Oriented crowd was that each of those layers should have been layered and abstracted out as objects, but that gets really chatty really fast.
Thankfully, though, you can have your cake and eat it, too. With Ocaml, you have first-class data structures that provide the same level of power and succinctness as Perl or Ruby while still providing type safety.
Lists are the easiest to demonstrate, and the most familiar to the Ruby/Perl coder. They just look like this: ["item 1"; "item 2"; "item 3"]. Since those are all strings, Ocaml interprets this list to have a type of “string list”.
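Nesting works the same way. Here is a small sketch of the "lists of maps of lists" style, written with tuples, association lists, and a record; the data and the names people, person, team, and knows are invented for this example.

```ocaml
(* A list of (name, roles, skills) tuples. The type
   (string * string list * (string * int) list) list is inferred. *)
let people = [
  ("ada",   ["admin"; "dev"], [("ruby", 3); ("ocaml", 9)]);
  ("grace", ["dev"],          [("ocaml", 10)]);
]

(* Records give the fields names without much extra ceremony. *)
type person = { name : string; langs : (string * int) list }

let team = [
  { name = "ada";   langs = [("ruby", 3); ("ocaml", 9)] };
  { name = "grace"; langs = [("ocaml", 10)] };
]

(* Names of everyone on the team who lists the given language. *)
let knows lang =
  List.filter (fun p -> List.mem_assoc lang p.langs) team
  |> List.map (fun p -> p.name)

let () =
  Printf.printf "%d people\n" (List.length people);   (* prints 2 people *)
  List.iter print_endline (knows "ruby")               (* prints ada *)
```

Mixing element types by accident, say putting an int in the roles list, is caught when the literal is compiled instead of when some later loop trips over it.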
Experiment with Syntax
It’s a common conceit of dynamic language enthusiasts that the lack of type safety and the fast-and-loose syntax maneuvers are actually a good thing, not just because they enable some patterns that aren’t viable in a static language, but because they encourage exploration:
Lispers talk about Bottom-Up Programming. Well, dangerous features enable bottom-up language evolution. We discovered we like Symbol#to_proc because it bubbled up from the bottom. Someone invents something. If other people like it, they use it. The word gets around. People improve on it. Eventually it gains acceptance and becomes the de facto way to write code. This is true in all languages, but languages—like Ruby—that include dangerous features give the fringe a broader latitude to invent new things. Of course, they also break things and they invent stupid things and they get excited and write entire applications by patching core classes instead of writing new classes and commit all sorts of sin. (Raganwald, from (1..100).inject(&:+))
Ocaml provides the ability to screw around with syntax and experiment with new language constructs; in fact, that’s one of its key purposes for existing. It enables that through a meta-language system called “camlp4”/“camlp5”, which acts as a pre-processor to your source code. The advantage of this is that you are producing backwards compatible code with all the same guarantees and reliability as the rest of the language: once I compile my library, the fact that I used list comprehensions and string interpolation preprocessors doesn’t matter.
This is certainly a leap and a bound beyond global duck punching, where you need to know how every use of that module is going to behave, and then validate that it works out in all those cases. The reality in my experience is that people who pull off global duck punching tend to just kind of pray and hope their unit tests catch any bug they just introduced. This makes me pessimistic about the maintainability of that code, and even the Ruby community is starting to agree with me (cite, cite).
Conclusion
If you have previously knocked static languages, I hope that you won’t trot out false statements like “static typing sucks because it doesn’t do duck typing” or “static typing slows down your development because you have to compile your code all the time”. If you were a static language fan who was clinging to Java/C#’s type system, hopefully you see that there’s a lot more out there than those languages allow.
This post started out as “5 of Ruby’s Greatest Hits in Ocaml”, and it’s a testament to the language that Ocaml stole the spotlight with things that Ruby can’t do at all. This is just the tip of the iceberg with functional programming and the way that it warps your mind, so I really hope you dig deeper into it.