Accidental memory allocation and loops are a particular challenge. Consider for example:

ocamlopt takes the code literally, and

let n = 100_000

let fib = Array.make n 1 ;;

for i = 2 to n-1 do

let a, b = fib.(i-2), fib.(i-1) in

fib.(i) <- a + b

done

`let a, b = ...`

allocates a tuple `(a, b)`

before discarding it. Simply rewriting the problematic statement as: makes the whole loop run in half the time.

let a = fib.(i-2) and b = fib.(i-1) in

The second problem with this loop is that we have lots of unnecessary references to the array fib. Each time we write

`fib.(i)`

, the compiler has to emit code to calculate `fib + i*sizeof(int)`

, and unlike C, ocamlopt doesn't use code motion to simplify and share the repeated references. We nearly halve the loop time again by accessing the array only once:

Overall these two simple optimizations reduce the total running time of this loop more than three-fold.

let a = ref 1 and b = ref 1 in

for i = 2 to n-1 do

let c = !a + !b in

a := !b; b := c; fib.(i) <- c

done

## 1 comment:

I only see 20% performance improvement for the second optimization on this AMD64. Not sure why...

Post a Comment