Update: I fixed the formating. Also, a half-completed version of this article got posted to the front page when it wasn’t supposed to, it got deleted as well.
Now for some real meat. I’ve come up with a neat algorithm for sorting lazy lists. I’m not sure if it’s original or not, but it’s a neat enough trick I thought I’d pass it on. It has a couple of interesting properties: forcing the first element of the sorted list costs only O(N) time/space complexity. Forcing each additional element costs only O(log N) time/space complexity. Which means if you force the entire list, it’s your standard O(N log N) complexity sort, but if you only need the first k+1 elements of the sorted list, it’s O(N + k log N), which (for small k) is signifigantly less than doing a full sort. This is important in a lazy environment where we may not need the entire list sorted.
I’m assuming you’re familiar with Ocaml and have read my introduction to lazy lists earlier today.
The algorithm is a variant on merge sort (which makes me kind of suspect it’s not original with me). It’s very easy to merge two sorted lists into one big sorted list, the code for doing this to lazy lists is similiar:
let rec merge cmp lst1 lst2 = lazy ( match (Lazy.force lst1), (Lazy.force lst2) with | Empty, e | e, Empty -> e | Node(h1, t1), Node(h2, t2) -> if (cmp h1 h2) <= 0 then Node(h1, merge cmp t1 lst2) else Node(h2, merge cmp lst1 t2) );;
An important thing to notice about the lazy merge is that all that calling it does is create a lazy thunk- no actual work is done at that point. Forcing each successive element, however, requires a comparison to be done to determine which list the element came out of- which implies forcing the first element of each of the sub lists. Why this is important will become obvious in a bit.
The normal implementation of merge sort is top-down. You take the whole list, split it into two, recursively sort the two halves, and then merge the results back together. This variation is instead bottom up. I convert the original lazy list of elements into a (non-lazy) list of 1-element long lazy lists:
let deepen lst = let rec loop accum = function | Empty -> accum | Node(h, t) -> loop ((singleton h) :: accum) (Lazy.force t) in loop [] (Lazy.force lst) ;;
Note that I don’t bother to do the final reverse on this list, and that the top level list is non-lazy. Why this is so will become apparent in a moment. I then go through and start merging pairs of these one-element lazy lists into two-element lazy lists:
let combine_pairs cmp reversed lst = let rec loop accum = function | [] -> accum | [ h ] -> h :: accum | x :: y :: t -> if (reversed) then loop ((merge cmp y x) :: accum) t else loop ((merge cmp x y) :: accum) t in loop [] lst ;;
This is where I deal with the list being backwards- if the list is backwards, I just reverse the order the arguments are passed into merge. This prevents me from having to be constantly building up backwards lists and then reversing them. It also has the interesting effect of making the sort function stable- elements which compare equal are left in the order they occurred in the original list. I keep calling this function, merging pairs of lists, until I have only one list left:
let reduce cmp lst = let rec loop reversed = function | [] -> assert false | [ res ] -> res | lst -> loop (not reversed) (combine_pairs cmp reversed lst) in if (lst == []) then lazy Empty else loop true lst ;;
Note that we assume the list originally passed into this routine is reversed (and that each successive call to combine_pairs reverses it again)- which is why we didn’t bother to reverse it in deepen. Also, the reason we made a non-lazy list of lazy lists also becomes more apparent. We consume the list of lists as soon as we produce it, so we gain nothing from making it lazy, and lose due to the space and cost penalty of lazy evaluation.
The sort routine itself is just a composition of deepen and reduce:
let sort cmp lst = lazy (Lazy.force (reduce cmp (deepen lst))) ;;
The lazy (Lazy.force ...) trick is so that the functions reduce and deepen aren’t called until the first element of the list is forced. This way, if no elements of the list are needed, the sorting is O(1).
Obviously, the resultant list is sorted. But what is the performance of this algorithm? The deepen function is obviously O(N), it’s a simple map. But what about reduce? The first time through the list of lazy lists takes N steps, each step an O(1) cost merge, since we don’t force any elements. The second time through, however, the list is only have as long, so it takes N/2 steps. The third time through N/4, and so on. So the total number of steps taken is 2N to produce the single, combined list.
The interesting thing is when we force the first element of the resultant list of reduce. This is a N-element long list which is the result of merging two N/2 element lists, so forcing the first element forces the first elements of these 2 lists. Forcing the first elements of these two lists in turns forces the first elements of the 4 N/4 element lists, not unlike dominoes falling. Forcing the first elements of those 4 N/4 element lists in turn forces the first elements of the 8 N/8 element lists, and so on, until we get down to forcing the first elements of the N 1-element lists produced by deepen. At which point we bottom out, and comparisons happen until we get back up to the top, and have our first element. The total number of steps here is, again, 1 + 2 + 4 + 8 + … + N, or 2N steps. So the total cost of forcing that first element is O(N).
Now, what happens when we force the second element of the total list? Forcing the second element of the whole list means we force the first element of an N/2 and a N/2 -1 element lists. But note- the first element of the N/2 element list has already been forced once before, when we forced the previous element. So we just get the cached result of that computation, which is an O(1) cost. It’s the N/2 – 1 element list that is interesting- forcing the first element (actually the second element of the original N/2 element list) in turn forces the first elements of a N/4 and N/4 – 1 element lists. But again, the N/4 element list has already been forced once. So we end up forcing only log2 N total new elements, so it takes only log N steps. In the same way forcing each additional element only takes at most O(log N) steps.
So forcing the entire N elements of the list takes O(N) time for the first element plus O((N – 1) * log N) elements to force the N-1 successive elements, for a total cost of O(N log N).
Note that a variant of this algorithm also works for randomly ordering a list in O(N log N) time, in a purely functional manner. If we changed our merge to use Random.bool instead of the passed in compare function, we’d have a function which quite nicely randomizes a list. This is usefull for, for example, shuffling a programmatic deck of cards.
One Comment
Great idea!
You could write it using camlp4′s lazy stream parsing syntax as well. Might be a little more elegant…
One Trackback
[...] “Now for some real meat. I’ve come up with a neat algorithm for sorting lazy lists. I’m not sure if it’s original or not, but it’s a neat enough trick I thought I’d pass it on.” (Sorting Lazy Lists) [...]