Jul 01 2008

Cedric’s Problem in OCaml

Published by Robert Fischer at 1:01 PM under To Be Categorized

From Otaku, Cedric’s weblog:

Here is an interesting coding challenge: write a counter function that counts from 1 to max but only returns numbers whose digits don’t repeat.

For example, part of the output would be:

* 8, 9, 10, 12 (11 is not valid)
* 98, 102, 103 (99, 100 and 101 are not valid)
* 5432, 5436, 5437 (5433, 5434 and 5435 are not valid)

Here’s an OCaml solution. Not terribly optimized, but good enough for free work. The solution is trivially changed into any base you’d like (even into calculating words with unique characters or whatever), and it calculates all the values from 1 to max_int in 27 seconds on my 2.16 GHz MacBook Pro.

(*
  Instead of thinking of these things in terms of number, 
  I'm thinking in terms of strings.
 
  Really, the question is: "how many unique strings of any length can you come up 
  with using the characters '0', '1', '2', '3', '4', '5', '6', '7', '8', and '9'", then capped
  at a certain point.
*)
 
let cap = max_int (* Set this to the cap you want *)
let cap_s = string_of_int cap
let cap_s_len = String.length cap_s
 
let handle_match str check =
  if check && String.length str >= cap_s_len && String.compare str cap_s > 1 then
    exit 0
  else () (* print_endline str *)
 
let filter_out lst x = List.filter (fun y -> String.compare x y != 0) lst
 
let symbols = List.map string_of_int [0;1;2;3;4;5;6;7;8;9]
let starting_symbols = filter_out symbols "0"
 
let rec build_of_length out len i syms =
  if i = len then handle_match out (len >= cap_s_len) else
  List.iter (
       fun str -> 
            out.[i] <- str.[0]; 
            build_of_length out len (i+1) (filter_out syms str)
   ) syms
 
let build_all_of_length len =
  List.iter
    (fun str ->
        let other_syms = filter_out symbols str
        in build_of_length (String.make len str.[0]) len 1 other_syms
    ) starting_symbols
 
let () =
  for i = 1 to cap_s_len+1 do
    build_all_of_length i
  done;
  ()

Popularity: 10% [?]

14 responses so far

14 Responses to “Cedric’s Problem in OCaml”

  1. Jason Kikelon 01 Jul 2008 at 3:27 PM

    Heres a ruby impl if you’re interested (i didn’t time it or anything):

    def doesntRepeat(val)
    val.to_s.split(”).inject({}) do |lookup,char|
    return false if lookup.has_key?(char)
    lookup[char] = nil
    lookup
    end
    true
    end

    (0..max).each { |i| puts i if doesntRepeat(i) }

  2. Robert Fischeron 01 Jul 2008 at 4:26 PM

    See the original links — Ruby solutions seem to be a dime a dozen. :)

  3. Robert Fischeron 01 Jul 2008 at 4:39 PM

    And for the record, I stopped that implementation after a minute and a half, even after ripping out the “puts” and just going “true if doesntRepeat(i)”.

  4. Cedric’s Coding challengeon 01 Jul 2008 at 5:30 PM

    [...] noticed Cedric’s Coding challenge while reading Robert Fishers blog. Here is an interesting coding challenge: write a counter function that counts from 1 to max but [...]

  5. Jason Kikelon 02 Jul 2008 at 5:02 AM

    Heh, well yeah its ruby :-)

    I’m excited to see the final output of John Lam and company on the IronRuby front. Ruby’s brevity coupled with a C#-like performance profile would make me a happy camper.

  6. pangoon 03 Jul 2008 at 5:51 PM

    My implementation is over 12 times faster on my box (with i/o disabled, ie let print_solution digits = ());
    With i/o enabled, it’s still pretty good (around 2.4s on my P4 2GHz):
    http://pastebin.be/12570

    Best regards, Pango

  7. Robert Fischeron 03 Jul 2008 at 11:02 PM

    @pango

    Where’s the huge speed-up come from? What am I doing wrong? Is string mangling just that much slower than list munging?

  8. Sanjeevon 04 Jul 2008 at 3:40 AM

    Couldn’t help not posting the Python solution :)

    def myrange(max):
    for val in xrange(1, max):
    if len(str(val)) == len(set(str(val))):
    print val,

  9. pangoon 04 Jul 2008 at 2:47 PM

    The main algorithms are very similar. I’m doing some optimizations, like only checking whether I reached the cap value for the numbers of the same length, however 90% of the generated numbers are in this case, so it shouldn’t help much.

    So the difference must indeed come the datastructures.

    Enumerated types (including “normal” integers) are the fastest values to work with in OCaml.

    Strings are slower because they’re allocated from the heap, however you don’t seem to allocate many of them (around 100?), so it should be ok. Maybe byte processing is a bit slow too, but I’m not sure.

    Lists are fast if you mostly use the efficient list operations: adding and removing elements at the “head” of lists. List.filter is not a very efficient list operation, the whole source list must be scanned, and a new list built from scratch (in fact if you look at the implementation it creates two lists, to be tail-recursive). So List.filter is O(n), and it’s being called a lot.

    Personally I’ve used an int as a bitfield to represent already used digits; Checking and setting a bit is O(1), and as an added benefit (versus, say, a bool array), it’s an immutable datastructure, which helps writing the exploration algorithm in a purely functional way. Using Sets would have been ok too for a purely functional implementation, with an intermediate complexity of O(log(n)), see the very nice version from Mauricio Fernandez.

    I suspect it’s the bitfield idea that makes my version faster. Checking the speed on an implementation that would use bitfields to represent already used digits and strings to represent numbers would be a good way to (in)validate that hypothesis.

    One last thing about i/o: if you want to display the results without slowing down your program too much, avoid using print_endline that flushes the output after each line (you can check the implementation in the Pervasives module). Instead, use Printf.printf “%s\n” or print_string plus print_char ‘\n’ to output solutions.

  10. Wheelwrighton 07 Jul 2008 at 1:11 PM

    Enumerable.Range(1, maxRange).Aggregate((total, n) =>
    {
    var chars = from c in n.ToString() select c;
    return total += (chars.Distinct().Count() == chars.Count()) ? n : 0;
    }));

    With no optimization it calculates all the values from 1 to max-int (maxRange = 32767) in 00.078 seconds on my crusty 512MB 3GHz Pentium 4.

    Another (albeit slightly slower) way to do it:

    Enumerable.Range(1, maxRange).Select(n =>
    {
    var chars = from c in n.ToString() select c;
    var count = chars.Count();
    return chars.Distinct().Count() == count ? n : 0;
    }
    ).Sum())

    And if anyone wonders I am using LINQ. Both queries are one-liners, line breaks added for clarity.

  11. Wheelwrighton 07 Jul 2008 at 2:15 PM

    Sorry, if you want printing this will do the job:

    for (int n = 0; n c).Distinct().Count() == n.ToString().Length)
    Response.Write(”" + n);

    Execution time increased to 00.093 secs

  12. Wheelwrighton 07 Jul 2008 at 2:17 PM

    Oh forget it, it got mangled because I included html break character

  13. Wheelwrighton 07 Jul 2008 at 2:32 PM

    Last time I swear, damn it is addictive:

    foreach (var n in Enumerable.Range(1, maxRange).Cast())
    Response.Write(n.Select(c => c).Distinct().Count() == n.Length ? n : “”);

  14. Mauricio Fernandezon 08 Jul 2008 at 8:37 AM

    Wheelwright: have you noticed that everybody else is using maxRange =
    10000000000 (or 4611686018427387903) as opposed to your maxRange = 32768?

    My fastest OCaml implementation uses functional sets represented with two lists
    (reminiscent of the classic functional queue).

    Here are the times I get on a 3GHz Athlon:

    Robert Fischer’s version: 2420ms
    pango’s version (no I/O): 569ms
    functional sets enc. as two lists: 273ms
    functional sets with manual inlining: 217ms

    (all versions capped to 10_000_000_000)

    You can find the code at http://eigenclass.org/misc/beust.ml and
    http://eigenclass.org/misc/beust2.ml.

Trackback URI | Comments RSS

Leave a Reply

Additional comments powered by BackType

Green Web Hosting! This site hosted by DreamHost.