Accidental memory allocation and loops are a particular challenge. Consider for example:
ocamlopt takes the code literally, and
let n = 100_000
let fib = Array.make n 1 ;;
for i = 2 to n-1 do
let a, b = fib.(i-2), fib.(i-1) in
fib.(i) <- a + b
done
let a, b = ... allocates a tuple (a, b) before discarding it. Simply rewriting the problematic statement as: makes the whole loop run in half the time.
let a = fib.(i-2) and b = fib.(i-1) in
The second problem with this loop is that we have lots of unnecessary references to the array fib. Each time we write
fib.(i), the compiler has to emit code to calculate fib + i*sizeof(int), and unlike C, ocamlopt doesn't use code motion to simplify and share the repeated references. We nearly halve the loop time again by accessing the array only once:
Overall these two simple optimizations reduce the total running time of this loop more than three-fold.
let a = ref 1 and b = ref 1 in
for i = 2 to n-1 do
let c = !a + !b in
a := !b; b := c; fib.(i) <- c
done
1 comment:
I only see 20% performance improvement for the second optimization on this AMD64. Not sure why...
Post a Comment