Accidental memory allocation and loops are a particular challenge. Consider for example:
ocamlopt takes the code literally, and
let n = 100_000
let fib = Array.make n 1 ;;
for i = 2 to n-1 do
let a, b = fib.(i-2), fib.(i-1) in
fib.(i) <- a + b
done
let a, b = ...
allocates a tuple (a, b)
before discarding it. Simply rewriting the problematic statement as: makes the whole loop run in half the time.
let a = fib.(i-2) and b = fib.(i-1) in
The second problem with this loop is that we have lots of unnecessary references to the array fib. Each time we write
fib.(i)
, the compiler has to emit code to calculate fib + i*sizeof(int)
, and unlike C, ocamlopt doesn't use code motion to simplify and share the repeated references. We nearly halve the loop time again by accessing the array only once:
Overall these two simple optimizations reduce the total running time of this loop more than three-fold.
let a = ref 1 and b = ref 1 in
for i = 2 to n-1 do
let c = !a + !b in
a := !b; b := c; fib.(i) <- c
done
1 comment:
I only see 20% performance improvement for the second optimization on this AMD64. Not sure why...
Post a Comment