Functional (Meta)?Programming Stunts for Ruby and Groovy (and a Little Perl)

After I learned OCaml, my coding mindset was totally distorted. I started writing Java code that looked like this:

  public Collection<Foo> getCertainFoos() {
    return
      CollectionUtils.select(getFoos(), new Predicate() { 
         public boolean evaluate(Object o) {
            return SOME_CONSTANT.equals(((Foo)o).getProperty());
         }
      });
  }

This is kinda ugly in Java, but it’s simply what came out when I was thinking this in OCaml:

List.find_all (fun foo -> some_constant = foo#getProperty ()) (self#getFoos ())

I also started slapping final everywhere (see Yet Another Reason final Is Your Friend). A ubiquitous use of final actually gave some nice patterns (in the “macro” sense of patterns), but raised all kinds of eyebrows and made my code unmistakable. This led to a unique coding style which you can see in my most involved open source project, JConch. Meanwhile, my co-blogger was talking about “The Hole in the Middle” Pattern, which is also a lay-up within FP circles but required some backflips to implement in Java (functional interfaces) and C# (delegates).

It wasn’t until the advent of Ruby and Groovy, though, that functional programming skills really became easier to use. Basically, because of the inline closure’s succinct syntax (and ability to access non-final variables), I could suddenly do all kinds of really fun stuff. This “fun stuff” was exactly the kind of stunts I was pulling in Perl back in the day (see the BEGIN block in my Text::Shift code for a reasonably accessible example), and it was part of the reason I loved Perl so much at the time.

So, I thought I’d share some more of these cute stunts with you.

Callbacks

This is probably the most obvious use of functional programming ideas. The basic idea is to have a bunch of “trigger” code that executes when some interesting circumstance occurs.

Ruby and Groovy people are probably most familiar with using this in the context of ORM, so here is a quick example from ActiveRecord and GORM:

# Ruby (ActiveRecord)
# Shamelessly copied from http://api.rubyonrails.org/classes/ActiveRecord/Callbacks.html
  class CreditCard < ActiveRecord::Base
    # Strip everything but digits, so the user can specify "555 234 34" or
    # "5552-3434" and both will mean "55523434"
    def before_validation_on_create
      self.number = number.gsub(/[^0-9]/, "") if attribute_present?("number")
    end
  end

// Groovy (GORM)
// Shamelessly copied from http://www.grails.org/GORM+-+Events
class Person {
   String name
   def beforeDelete = {
      new ActivityTrace(eventName:"Person Deleted",data:name).save()
   }
}
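
Outside of an ORM, the same trigger idea can be sketched in a few lines of plain Ruby. This is only an illustration: Record, hooks, and before_save below are made-up names for this sketch, not the ActiveRecord or GORM API, and the "save" does not persist anything.

```ruby
# Ruby
# Hypothetical mini-framework for illustration only (not ActiveRecord).
class Record
  # Each subclass keeps its own table of event name => list of blocks.
  def self.hooks
    @hooks ||= Hash.new { |hash, event| hash[event] = [] }
  end

  # Registers a block to run before every save.
  def self.before_save(&block)
    hooks[:before_save] << block
  end

  def save
    # Run each registered trigger with the record itself as self.
    self.class.hooks[:before_save].each { |hook| instance_eval(&hook) }
    true # a real implementation would persist the record here
  end
end

class CreditCard < Record
  attr_accessor :number

  # Strip everything but digits before saving.
  before_save { self.number = number.gsub(/[^0-9]/, "") if number }
end

card = CreditCard.new
card.number = "5552-3434"
card.save
puts card.number
```

The point of the design is that Record#save knows nothing about credit cards; it only knows that somebody may have handed it code to run.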

However, that’s not the only time that callbacks are useful. A particularly useful case I find is error handling: often, error handling violates layers of abstraction, because you’d like something to happen on the UI in response to an exception down in the code. The answers I’ve seen range from a Rube Goldberg-style chaining of try/catch blocks to a static ErrorHandler class which sat right on top of the UI and was called from wherever. By passing in some code to get executed on error, code several layers down can respond to errors using things it doesn’t know about and doesn’t have access to. Plus, this allows you to define your error handling code at the top level (which cares about it), and not worry about it elsewhere.

# Ruby
def topLevel 
   error = nil
   onErr = Proc.new { |err| puts "In onErr"; puts error = err.message }
   recurse(onErr)
   if(error)
      puts "Oh nos!  #{error}" 
   else
      puts "Yay!"
   end
end
 
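def recurse(onErr)
    begin
        # i starts at 0, so the very first division raises ZeroDivisionError
        (0..10).each do |i| puts(210/i) end
    rescue
        # Hand the exception ($!) to the caller-supplied handler
        onErr.call($!)
    end
end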

Note that Rails 2.0 apparently jumped on this approach, too: here is a nice example.

Continuation Passing Style

This is one of those big, scary sounding words for something that’s very simple. Basically, think about times when it’s more natural to tell an object, “Oh, hey, when you’re done doing that — do this stuff.”

I had a classic example of this in a Groovy codebase I’m currently working on. Basically, the story is that I fire off a thread to do a lot of complex HTTP work: content negotiation for compression, redirection handling, all that jazz. I encapsulated that work into an object called a FeedFetcher. The FeedFetcher takes, as a second argument, a closure into which its result is passed.

            // Groovy
            updatePollDate(feedId)
            def urlObj = urlForId(feedId)
            new FeedFetcher(urlObj) { body ->
                try {
                    updateFeed(Feed.findById(feedId), body)
                } catch(Exception e) {
                    errHandle(e)  // Exception callback (see above)
                }
            }

So, although it nicely reads like everything happens inline, there’s actually a whole ton of work being done when I call new FeedFetcher(urlObj). In actuality, the closure is even going to be executed in a distinct thread from the code two lines above it! The FeedFetcher throws off a new thread, and that thread calls the closure, and suddenly this new thread is doing all this additional work it didn’t know about. Yet the FeedFetcher doesn’t need to know anything about what I want to do with the body — it is coded up with the single-minded intent of getting the body and handing it off to something else to process. This keeps my code focused and limited, and prevents me from having to pass in parameters (or, worse, a context map) with things the FeedFetcher doesn’t care about.
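
For the Ruby folks, here is a rough sketch of the same shape. FeedFetcherSketch is a hypothetical stand-in written for this example, not the real FeedFetcher: the "HTTP work" is faked with a string, and the wait method exists only so the script can block until the background thread is done.

```ruby
# Ruby
# FeedFetcherSketch is a hypothetical stand-in, not the author's FeedFetcher.
class FeedFetcherSketch
  def initialize(url, &continuation)
    # The work happens on a new thread; when it finishes, that same thread
    # hands the result to the continuation it was given.
    @thread = Thread.new do
      body = "<feed from #{url}>" # stand-in for the real HTTP work
      continuation.call(body)
    end
  end

  # Only needed so this script can wait for the background thread.
  def wait
    @thread.join
  end
end

seen = []
fetcher = FeedFetcherSketch.new("http://example.com/feed") do |body|
  # Runs on the fetcher's thread, yet can still reach `seen`.
  seen << body
end
fetcher.wait
puts seen.inspect
```

As in the Groovy version, the fetcher never learns what the block does with the body; it just calls it.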

Private (No, Really, I Mean It) Variables

One of the things that Perl, Ruby, and Groovy have in common is that instance variables just can’t hide that well. And, normally, this is just how the language rolls. But you sometimes really want to enforce a strict policy of single access: only one method gets to touch a variable.

This is a perfect place for closures.

# Ruby
def secretPropMaker
  a = nil
  Proc.new {|*args| if args.length > 0 then a = args[0] else a end }
end
 
class X
  define_method(:a, secretPropMaker())
end
 
x = X.new()
x.a("Hello!")
puts x.a()

If you wanted to extend this to be more like Module#attr, you could certainly build off of what I have here.
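
One thing to keep in mind about the Ruby snippet above: secretPropMaker runs once, while the class body is evaluated, so every instance of X shares the same hidden value. If you want one hidden value per object, a possible variation (a sketch only; SecretProps and secret_prop are names invented here) is to create the closure per instance:

```ruby
# Ruby
# SecretProps and secret_prop are hypothetical names for this sketch.
module SecretProps
  # Defines a combined getter/setter on this one object. The value lives in
  # the local variable `value`, which only the block below can reach.
  def secret_prop(name, initial = nil)
    value = initial
    define_singleton_method(name) do |*args|
      value = args[0] unless args.empty?
      value
    end
  end
end

class Y
  include SecretProps

  def initialize
    secret_prop(:a)
  end
end

first = Y.new
second = Y.new
first.a("Hello!")
second.a("Goodbye!")
puts first.a
puts second.a
puts first.instance_variables.inspect # no instance variable to poke at
```

The trade-off is that each object carries its own singleton method, which costs a bit more than a method defined once on the class.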

This is also a good way to generate getters and setters in Groovy, which doesn’t allow you to create new instance variables on the fly like Ruby does.

// Groovy
class X {}
 
def propMaker() {
    def a = null
    return [ {-> a}, {i -> a = i} ]
}
 
def accs = propMaker()
X.metaClass.getA = accs[0]
X.metaClass.setA = accs[1]
 
def x = new X()
x.a = "Hello!"
println x.a

Function Templates

If you have a number of functions which differ in slight (but important) ways, hold onto a function template and curry in the values. This is what I did in that Text::Shift example above. The basic idea is that you define a general form of the function by parameterizing the differing parts, and the new functions are created by binding those parameterized parts.

This is easiest to see in action in some OCaml code I wrote:

(* OCaml *)
open ANSITerminal
let msg style = fun str -> (print_newline (); print_string style str; flush_all ())
let normal_msg = msg []
let error_msg = msg [Bold; red] 
let admin_msg = msg [blue]
let prompt_msg = msg [yellow; on_blue]

So, what I have here is a function template that generates functions which print a newline, then a string with a given terminal style, and then flush all the output. This enables me to have the same behavior for all my different kinds of messages, and I can create new message types by simply specifying the style.

Here’s a similar stunt to my OCaml code, done in Groovy, which wraps strings in some kind of before-and-after delimiter.

// Groovy
class X {
  static {
    def delimit  = { before, after, str -> return before + str + after }
    X.metaClass.parens << delimit.curry("(", ")")
    X.metaClass.bracket << delimit.curry("[", "]")
    X.metaClass.dash << delimit.curry("-", "-") 
  }
}
println(new X().parens("Hello")) // Prints "(Hello)"

Ruby doesn’t natively support a curry method, although I think this thread says it’s coming in 1.9. It’s easy to write such a thing, and if you’re using the extensive Ruby Facets library, you don’t have to write a thing (although I’m not a fan of the “__” action). (Hat-tip to Raganwald for the facets lead.)

So, all that said, here’s the Ruby code to do the same thing as the Groovy code above.

# Ruby
require 'facets'
 
class X
  delimit  = Proc.new { |before, after, str| before + str + after }
 
  define_method(:parens, delimit.curry("(", ")", __))
  define_method(:bracket, delimit.curry("[", "]", __))
  define_method(:dash, delimit.curry("-", "-", __))
end
 
puts X.new().parens("Hello") # Prints "(Hello)"
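
For what it’s worth, Ruby 1.9 and later ship Proc#curry, so the same template can be sketched without Facets. This is one possible spelling rather than the only one; the table of names and delimiters is just a convenience for this example.

```ruby
# Ruby (1.9 or later, no Facets required)
delimit = lambda { |before, after, str| before + str + after }

class X; end

# Method name => the delimiters to bind into the template.
{ parens: ["(", ")"], bracket: ["[", "]"], dash: ["-", "-"] }.each do |name, (before, after)|
  # Bind the first two arguments; `wrap` is still waiting for the string.
  wrap = delimit.curry[before, after]
  X.send(:define_method, name) { |str| wrap.call(str) }
end

puts X.new.parens("Hello")
puts X.new.bracket("Hello")
```

Note that the built-in curry binds arguments from the left, so there is no need for a placeholder in the last position.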

And here’s the Text::Shift example from before. Basically, all I’m doing is creating a caller-specific getter/setter, so distinct packages have their own global variables. And I do this for three different functions.

    # Perl
    my $funcref = sub( $ ) {
        my $index = shift;
        return sub($$) {
            (undef, my $caller) = (shift, caller);
            if(scalar(@_)) {
                # Modifier return
                $_abs{$caller} = [] unless($_abs{$caller});
                return $_abs{$caller}->[$index] = join("",@_);
            } else {
                # Accessor return
                return $_abs{$caller}->[$index];
            }
        };
    };
 
    *uppercase = $funcref->(UPPDEX);
    *lowercase = $funcref->(LOWDEX);
    *numbers   = $funcref->(NUMDEX);

This would be particularly useful as an alternative to eval-style tricks driven by methodMissing. In Groovy, it looks like this:

// Groovy
// Don't ask, but you have to do this to get the cache
// http://enfranchisedmind.com/blog/2008/06/26/groovy-metaclass-bug/
ExpandoMetaClass.enableGlobally()
 
class X {
  private static def says = { who, what -> println "$who says: $what" }
 
  def methodMissing(String name, args) {
    println "In method missing!"
    def sezMe = says.curry(name)
    X.metaClass."$name" << sezMe
    sezMe(args[0])
  }
 
 }
 
def x = new X()
x.robert("Hey there!")
x.robert("Ho there!")
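
A rough Ruby counterpart, using method_missing and the built-in Proc#curry (Ruby 1.9 or later). Speaker, SAYS, and the misses counter are names made up for this sketch; the counter is there only to show that method_missing is hit once per name, after which the generated method takes over.

```ruby
# Ruby
# Speaker, SAYS, and misses are hypothetical names for this sketch.
class Speaker
  SAYS = lambda { |who, what| "#{who} says: #{what}" }

  attr_reader :misses

  def initialize
    @misses = 0
  end

  def method_missing(name, *args)
    # Only handle the one-argument calls this sketch is about.
    return super unless args.length == 1

    @misses += 1
    # Bind the method name into the template, then install the result
    # as a real method so later calls skip method_missing entirely.
    line = SAYS.curry[name.to_s]
    self.class.send(:define_method, name) { |what| line.call(what) }
    send(name, *args)
  end
end

speaker = Speaker.new
puts speaker.robert("Hey there!")
puts speaker.robert("Ho there!")
puts speaker.misses
```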

Let The System Handle The Clean-Up

The whole idea of having to explicitly close my resources has always annoyed me. It’s bug-prone (“Did I close this already?”), violates levels of abstraction (“Am I responsible for closing this, or are they?”), and is generally more hassle than it’s worth.

So, stop it. Tell the system how to clean up your stuff, and let it deal with the janitorial work.

The broadest stroke is to close things at exit.

# Ruby
handle = ...
at_exit { handle.close() }

// Groovy
def handle = ...
def r = { -> handle.close() } as Runnable
Runtime.runtime.addShutdownHook(new Thread(r))

Better, though, would be to clean up things when the object is garbage collected. OCaml provides an external finalization hook:

(* OCaml *)
(* General usage *)
Gc.finalise cleanup x
 
(* A more realistic example. *)
Gc.finalise close_out_noerr my_out_chan

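Ruby can do this, too, with one caveat: the finalizer runs after the object is gone, so the proc must not hold a reference to the object itself, or the object will never be collected. Build the proc somewhere that handle is not in scope, and give it only the underlying resource.

# Ruby (`resource` stands in for whatever handle wraps)
def closer(resource) proc { |_object_id| resource.close() } end
ObjectSpace.define_finalizer(handle, closer(resource))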

I don’t think Groovy offers this (yet) — Java’s solution has been to extend the class and then override #finalize, but that’s an awkward solution (and worthless for things like String, which is final). Java can fake the external finalizer using phantom references — see the source for Commons-IO’s FileCleaningTracker as an example. I thought there was a utility in Commons-Lang which encapsulated the phantom reference tracking and background thread, but I couldn’t find it.


That concludes the tour of some interesting functional language stunts, which apparently pushes boundaries on both Ruby and Groovy. If you’d like more examples or more clarity, leave a comment and let me know. I’m very interested to hear other people’s thoughts. In particular, where have you found the functional programming style to be particularly clean in your code?


5 Comments

  1. Posted June 24, 2008 at 1:58 PM | Permalink

    I was going to do mocking, but that works a lot better for Groovy than for Ruby thanks to the dot operator conflating map key/object property access and the more succinct closure calling syntax. So that’s going to get its own post in the near future.

  2. Posted June 24, 2008 at 4:56 PM | Permalink

    I can’t let the comment “explicitly close my resources has always annoyed me” go by without my own comment.

    Resource ownership has semantic meaning. It’s not just about memory and type safety, where GC has a good application without affecting semantic meaning – a theoretical GC with infinite memory that never freed any memory would be perfectly fine. On the other hand, closing a file handle has a concrete, observable, global effect on the system – depending on the sharing mode / OS, you can’t reopen or delete an already-open file; there may be unflushed buffers in your RTL’s IO code before final closing; etc. Similarly, closing a socket will be observed by the other end; and there are analogues for most OS resources that a program can acquire.

    Leaving resource cleanup to a GC via finalization is in almost all cases very bad practice.

    GCs aren’t aware of the systemic effects of the various resources whose lifetimes you have associated with arbitrary objects, so these calculations won’t enter into when it considers it opportune to perform a GC. And besides, in certain cases the GC may fail to invoke finalizers for your objects altogether.

    Resources need deterministic cleanup. If you find it difficult to work out ownership semantics, consider reference counting (perhaps via a handle object) or an API protocol. For example, in C# I often use an enumeration called ResourceOwnership with two values, Preserve and Transfer. Any API that takes a resource value that has a semantic ambiguity of ownership takes a value of this enumeration type. I blogged about it some time ago (http://barrkel.blogspot.com/2006/08/specfiying-ownership.html).

    Just say no to finalization. It’s (almost certainly) not the answer you’re looking for.

  3. Posted June 25, 2008 at 5:17 AM | Permalink

    Sure, there are cases when deterministic resource clean-up is important — namely, when it’s important to know a resource has been cleaned up at a particular time. Using the GC to close resources which you may want to access again in the same run is simply a bad plan.

    However, there’s a lot of cases where deterministic resource clean-up isn’t all that important. Take, for instance, libcurl’s API, which requires global init and global clean-up to be done. The OCaml runtime will actually do a *better* job of running the global clean-up reliably than my own code will, since it has a chance to respond to things like signals which my code doesn’t address. So what’s wrong with using it there?

  4. Posted June 25, 2008 at 5:56 AM | Permalink

    Oh, and flushing sockets/channels is something that should certainly be done deterministically: don’t leave floaters in the pipe.

  5. Posted June 26, 2008 at 6:56 AM | Permalink

    Wordie of it. ‘Coz why not?


