Wednesday 7 May 2008

Optimizing, memory allocation and loops

Unlike C compilers such as gcc, ocamlopt doesn't do very much in the way of optimization. Really it compiles the program directly down to machine code pretty much the way you write it.

Accidental memory allocation and loops are a particular challenge. Consider for example:

let n = 100_000
let fib = Array.make n 1 ;;
for i = 2 to n-1 do
let a, b = fib.(i-2), fib.(i-1) in
fib.(i) <- a + b
done
ocamlopt takes the code literally, and let a, b = ... allocates a tuple (a, b) before discarding it. Simply rewriting the problematic statement as:

let a = fib.(i-2) and b = fib.(i-1) in
makes the whole loop run in half the time.

The second problem with this loop is that we have lots of unnecessary references to the array fib. Each time we write fib.(i), the compiler has to emit code to calculate fib + i*sizeof(int), and unlike C, ocamlopt doesn't use code motion to simplify and share the repeated references.

We nearly halve the loop time again by accessing the array only once:

let a = ref 1 and b = ref 1 in
for i = 2 to n-1 do
let c = !a + !b in
a := !b; b := c; fib.(i) <- c
done
Overall these two simple optimizations reduce the total running time of this loop more than three-fold.

1 comment:

Jon Harrop said...

I only see 20% performance improvement for the second optimization on this AMD64. Not sure why...