Tuesday, 16 September 2008

Tip: Read all lines from a file (the most common OCaml newbie question?)

Is this the most common OCaml beginner's question? It comes up every few weeks on the OCaml beginners list, and I have tried to answer it before.

It seems that everyone who learns OCaml comes away with the impression that functional programming is the New Cool Thing, and imperative programming is Bad and Must Be Avoided.

I'm going to say it now: programming fashions are stupid and counterproductive. The only things that matter are that your program is short, easy to write, easy to maintain and works correctly. How you achieve this has nothing to do with programming fads.

Reading all lines from a file is an imperative problem, and the shortest solution (easy to write, easy to maintain and correct [1]) uses a while loop, in OCaml or any other language:
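let lines = ref [] in
let chan = open_in filename in
try
  (* Loop until input_line raises End_of_file. *)
  while true do
    lines := input_line chan :: !lines
  done; []  (* never reached; only there so both branches have type string list *)
with End_of_file ->
  close_in chan;
  (* Lines were consed on in reverse order, so flip them back. *)
  List.rev !lines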
Actually, no, I'm lying. The best solution is this:
Std.input_list chan
which is supplied by extlib. Don't bother to duplicate functions which are already provided in commonly available libraries.


[1] Strictly speaking, this is only correct if you also clean up when input_line raises some other exception, such as a read error. In the common case where you just exit the program, leaving the channel open is perfectly acceptable.
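
If you do care about that clean-up, here is a minimal sketch of one way to do it, assuming OCaml 4.08 or later for Fun.protect. The name read_lines is my own choice for illustration, not something from extlib or the standard library:

```ocaml
(* Hypothetical helper: read every line of [filename] into a list.
   Fun.protect runs ~finally whether the body returns normally or raises,
   so the channel is closed even if input_line fails with a read error. *)
let read_lines filename =
  let chan = open_in filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr chan)
    (fun () ->
      let lines = ref [] in
      (try
         while true do
           lines := input_line chan :: !lines
         done
       with End_of_file -> ());
      (* Lines were accumulated in reverse order. *)
      List.rev !lines)
```

It is still a while loop; the only change is that closing the channel has moved out of the End_of_file handler and into the finally clause.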

5 comments:

ivazqueznet said...

Actually, Python uses a for loop:

f = open('file.txt')
for line in f:
  dosomethingwith(line)
f.close()

Or you can use f.readlines() if you're feeling wild and woolly.

call said...

>I'm going to say it now: programming fashions are stupid and counterproductive. The only things that matter are that your program is short, easy to write, easy to maintain and works correctly. How you achieve this has nothing to do with programming fads.
Wisest thing I've heard (read) in a while :-)
Thank you for this meaningful post [^_^]

yoric said...

Actually, with the patch which I hope makes its way to ExtLib, the Python extract becomes

open Enum
open IO
open File

with_file_in "file.txt" (fun f ->
  iter (fun line ->
    dosomething line
  ) (lines_of f)
)

The extract will also take care of closing the file at the end and/or in case of exception.

ivazqueznet said...

Oh sure, if you want to go that way...

from __future__ import with_statement

with open('file.txt') as f:
  for line in f:
    dosomethingwith(line)

yoric said...

Let's keep playing :)

With the same patch.

open Enum
open IO
open File

with_file_in "file.txt" (
  lines_of |- iter do_something
)

or, if you prefer

with_file_in "file.txt" (
  iter do_something -| lines_of
)

or, if you prefer

lines_of |- iter do_something |>
with_file_in "file.txt"

Although I grant you that this concision comes at the expense of readability, something which the Python version manages to avoid.

Of course, I can cheat a little further and use a nice little syntax extension of mine and end up with

with file_in "file.txt" as f
  for line in lines_of f
    do_something line

Which somehow reminds me of the Python version :)