Object interfaces in OCaml
This post is about interfaces, or more precisely, interfaces with dynamic dispatch. We'll briefly cover why these can be useful but the main point is to explore our options in OCaml.
Interfaces specify how a software module is to be interacted with and hide implementation details. In some cases, an interface can even hide where the implementation is located, so that the code to run is only selected at runtime. This is dynamic dispatch, very common in object-oriented languages such as C++ (with virtual methods) or Java.
Here are some common interfaces:
- In Java, a Set can be implemented in different ways (e.g. tree set, hash set), but the user of the set doesn't need to know what implementation is used.
- In Python, io.RawIOBase is the interface for reading raw bytes, typically from a file. An object implementing it might actually represent a network stream, or any suitably-behaved sequence of bytes. The same sort of interface exists in many languages, including Java and Rust.
So, what about OCaml? Well, OCaml has modules and interfaces, but those are typically statically dispatched. What if we wanted to do dynamic dispatch, in an object-oriented way?
Our Running Example: Counters
To explore the different options offered by OCaml concisely, I'm going to use an artificial, but hopefully not simplistic, example: counters.
A counter starts from an initial value (defined by its implementation), can be bumped, and can report its current value.
Below is how we are going to visualize our counter implementations. This command-line program should take one argument (integer or power) and print the first 10 values of the counter:
let () =
  (* Structural inequality (<>) is the idiomatic integer comparison. *)
  if Array.length Sys.argv <> 2 then (
    Printf.printf "usage: %s <integer|power>\n" Sys.argv.(0);
    exit 1);
  let counter = get_counter ~counter:Sys.argv.(1) in
  match counter with
  | None ->
    Printf.printf "Unknown counter: %s\n" Sys.argv.(1);
    exit 1
  | Some counter -> use_counter ~counter
We will later need to define get_counter and use_counter to print the right values.
In a terminal, we want to get something like the following, assuming our counter counts integers:
> ./count integer
0
1
2
3
4
5
6
7
8
9
Approach 1: Classes
This is the obvious approach if you're coming from an object-oriented language. After all, the "O" in OCaml stands for "Objective", so this makes sense, right?
Yes, it's perfectly sensible, but for some reason, OCaml classes and objects are not widely used. Maybe this is because of the syntax, which is quite different from that of records, functions and sum types. Regardless, the result is that many developers, myself included, don't have a lot of experience with classes. That being said, let's see how they work.
Interface
Below is the definition of our very simple interface (get and bump).
class type counter = object
  method get : int
  method bump : unit
end
If you're familiar with OCaml, you may have already noticed two syntactic differences with the OCaml you may be used to:
- We are defining a class type, not a type. It is a distinct kind of declaration with its own syntax, although it also gives us an object type named counter that can be used in type annotations.
- Unlike a function, a method which doesn't need parameters can be defined without any parameter. A function would have needed at least one argument to work as a function (typically, unit is used in such a case).
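The second point can be illustrated with a minimal sketch. The names cell, as_value, as_function and as_object are hypothetical and only serve this illustration; they are not part of the counter example.

```ocaml
(* Hypothetical names, for illustration only. *)
let cell = ref 0

(* A plain value is computed once, when this definition is evaluated. *)
let as_value = !cell

(* A function needs an argument (here unit) so that its body runs on each call. *)
let as_function () = !cell

(* A method body runs on every invocation, even without any parameter. *)
let as_object = object
  method get = !cell
end

let () =
  cell := 42;
  (* Prints "value: 0, function: 42, method: 42". *)
  Printf.printf "value: %d, function: %d, method: %d\n"
    as_value (as_function ()) as_object#get
```

Only as_value misses the update, because it was evaluated before cell changed.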
Implementations
We define two types of counter:
- "integer": This counter starts from zero and is incremented by each call to bump.
- "power": This counter starts from one and doubles at each iteration, effectively going through all the powers of two (1, 2, 4, 8, etc).
Here are the corresponding classes:
module Integer : sig
  class t : counter
end = struct
  class t = object
    val mutable value = 0
    method get = value
    method bump = value <- value + 1
  end
end

module Power : sig
  class t : counter
end = struct
  class t = object
    val mutable value = 1
    method get = value
    method bump = value <- value * 2
  end
end
For each class, we have an internal state value and two implemented methods. This is probably very easy to understand if you're familiar with object-oriented programming.
This implementation uses classes for convenience but this is not necessary. Unlike most languages with OOP capabilities, OCaml types each object by the signatures of its methods rather than by the class it was created from. The correct use of objects is guaranteed via structural typing. In other words, we could have avoided class t and new, while still using class type and object.
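As a minimal sketch of that last remark, here is the integer counter written as an immediate object, without class t or new. The function name make_integer is an assumption made for this illustration.

```ocaml
class type counter = object
  method get : int
  method bump : unit
end

(* Hypothetical constructor (not from the code above): it builds an
   immediate object, so no class and no [new] are involved. *)
let make_integer () : counter =
  object
    val mutable value = 0
    method get = value
    method bump = value <- value + 1
  end

let () =
  let c = make_integer () in
  c#bump;
  c#bump;
  (* Prints "2". *)
  Printf.printf "%d\n" c#get
```

The annotation on make_integer is accepted because the object has exactly the methods listed in counter, which is all structural typing asks for.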
Usage
And now we define the missing functions: (1) Instantiating the right kind of counter depending on the input argument, and (2) printing the first 10 numbers of the chosen counter.
let get_counter ~counter =
  match counter with
  | "integer" -> Some (new Integer.t)
  | "power" -> Some (new Power.t)
  | _ -> None

let use_counter ~counter =
  for _ = 0 to 9 do
    Printf.printf "%d\n" counter#get;
    counter#bump
  done
In that use_counter function, as you can see, there is no indication that the counter is of one type or another. Actually, because of dynamic dispatch, not even the compiler can know for sure which implementation of get and bump will be used at runtime.
The same will be true in all the other examples. This is the point of using those interfaces.
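To make the dispatch visible, here is a small self-contained sketch. The helper take and the top-level class names integer and power are assumptions for this illustration; the post itself wraps its classes in modules.

```ocaml
class type counter = object
  method get : int
  method bump : unit
end

class integer : counter = object
  val mutable value = 0
  method get = value
  method bump = value <- value + 1
end

class power : counter = object
  val mutable value = 1
  method get = value
  method bump = value <- value * 2
end

(* Hypothetical helper: collect the first [n] values of any counter.
   The same call sites c#get and c#bump serve both implementations. *)
let rec take n (c : counter) =
  if n <= 0 then []
  else begin
    let v = c#get in
    c#bump;
    v :: take (n - 1) c
  end

let () =
  (* Both objects have the same type, so they fit in one list. *)
  List.iter
    (fun c ->
      take 5 c |> List.map string_of_int |> String.concat " " |> print_endline)
    [ new integer; new power ]
```

This should print "0 1 2 3 4" followed by "1 2 4 8 16": the body of take never changes, only the object it receives.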
Approach 2: Records
Records are widely used in OCaml. They are like structs in other languages such as C or Rust. When we say that OCaml has ADTs (algebraic data types), this is because it has sum types (e.g. type value = Int of int | Str of string) and product types (e.g. records and tuples).
Here we are going to combine records with first-class functions and lexical closures, two very typical features of any functional language. Since these concepts are so prevalent, this is probably how most OCaml developers would define our "counter" interface.
Interface
The functions are defined as fields in the record type. You might be wondering where we will be storing the state of the counter (its integer): we'll take care of that later. For now, all we want to expose is the interface (get and bump), so avoiding the exposure of the internal state here is actually good for us.
type record = {
  get : unit -> int;
  bump : unit -> unit;
}
Implementations
We define two "constructors": Integer.create and Power.create. Each constructor takes a dummy argument () and then mostly does two things:
- Create a mutable integer reference (value).
- Create a record of type record with functions referencing that integer.
When create () returns, value goes out of scope so it cannot be used by any code except the two functions in the returned record. As such, it's really an internal state. Every time create () is used, a new value is created just for one record.
This works because the anonymous functions defined for get and bump capture their environment, which contains value, thanks to a lexical closure. As mentioned earlier, most languages with first-class functions have such closures.
module Integer : sig
  val create : unit -> record
end = struct
  let create () =
    let value = ref 0 in
    { get = (fun () -> !value); bump = (fun () -> value := !value + 1) }
end

module Power : sig
  val create : unit -> record
end = struct
  let create () =
    let value = ref 1 in
    { get = (fun () -> !value); bump = (fun () -> value := !value * 2) }
end
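The claim that each call to create () gets its own private state can be checked with a short sketch. For brevity it uses a single top-level create (an assumption; the post keeps one per module) that mirrors Integer.create.

```ocaml
type record = { get : unit -> int; bump : unit -> unit }

(* Same body as Integer.create, defined at top level for brevity. *)
let create () =
  let value = ref 0 in
  { get = (fun () -> !value); bump = (fun () -> value := !value + 1) }

let () =
  let a = create () in
  let b = create () in
  a.bump ();
  a.bump ();
  b.bump ();
  (* Prints "a = 2, b = 1": each record closed over its own reference. *)
  Printf.printf "a = %d, b = %d\n" (a.get ()) (b.get ())
```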
Usage
Using the counters defined above is rather straightforward, and not much different from what we did with objects in our first approach.
let get_counter ~counter =
  match counter with
  | "integer" -> Some (Integer.create ())
  | "power" -> Some (Power.create ())
  | _ -> None

let use_counter ~counter =
  for _ = 0 to 9 do
    Printf.printf "%d\n" (counter.get ());
    counter.bump ()
  done
Approach 3: First-Class Modules
Now let's have some fun. First-class modules may have their uses but they are definitely not my first choice when it comes to interfaces with dynamic dispatch. That being said, this post is a good opportunity to learn more about them.
I assume you know what an OCaml module is. The idea behind first-class modules is similar to that behind first-class functions: using modules as values (e.g. as variables, function arguments).
First-class modules are quite different from functors although they share some characteristics. The latter are much more common. For example, you will find functors such as Map.Make in the standard library.
Interface
As with all our earlier approaches, we first define our interface. This time it takes the form of a module type:
module type COUNTER = sig
  val get : unit -> int
  val bump : unit -> unit
end
Nothing first-class here: this is an ordinary module signature.
Implementations
Complications start here, and you should soon understand why this approach is not so common. Have a look (explanations after the code):
module Integer : sig
  val create : unit -> (module COUNTER)
end = struct
  let create () =
    (module struct
      let value = ref 0
      let get () = !value
      let bump () = value := !value + 1
    end : COUNTER)
end

module Power : sig
  val create : unit -> (module COUNTER)
end = struct
  let create () =
    (module struct
      let value = ref 1
      let get () = !value
      let bump () = value := !value * 2
    end : COUNTER)
end
Like for records, we define constructors Integer.create and Power.create. The differences are the following:
- The return type changes: record → (module COUNTER)
- The creation of the container changes: { get = … } → (module struct let … end : COUNTER)
- The integer state is stored inside the container instead of outside (where it was captured via a lexical closure). Actually, that was not a necessary change, but I can afford it because the state is hidden thanks to the COUNTER signature.
The surrounding parentheses are mandatory in the items listed above. This is part of the distinct syntax for first-class modules.
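Since moving the state inside the module was not a necessary change, here is a sketch of the other option, where the reference stays outside and is captured by a closure as in the record approach. The top-level create is an assumption made to keep the example short.

```ocaml
module type COUNTER = sig
  val get : unit -> int
  val bump : unit -> unit
end

(* Hypothetical variant of Integer.create: [value] lives outside the
   packed module and is captured by [get] and [bump]. *)
let create () =
  let value = ref 0 in
  (module struct
    let get () = !value
    let bump () = value := !value + 1
  end : COUNTER)

let () =
  let module A = (val create () : COUNTER) in
  let module B = (val create () : COUNTER) in
  A.bump ();
  (* Prints "1 0": the two modules do not share their state. *)
  Printf.printf "%d %d\n" (A.get ()) (B.get ())
```

Either way, value is unreachable from outside: here because it is a local binding, in the post's version because COUNTER does not list it.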
Usage
We're not quite done yet. In fact, those modules are not that complicated, but they require this weird syntax.
In the following code, notice how we actually use the first-class module: by converting it from a value (counter) into an actual module (Counter):
let get_counter ~counter : (module COUNTER) option =
  match counter with
  | "integer" -> Some (Integer.create ())
  | "power" -> Some (Power.create ())
  | _ -> None

let use_counter ~(counter : (module COUNTER)) =
  let module Counter = (val counter : COUNTER) in
  for _ = 0 to 9 do
    Printf.printf "%d\n" (Counter.get ());
    Counter.bump ()
  done
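As a side note, the conversion can also happen directly in the parameter pattern, which saves the explicit let module line. The sketch below assumes a hypothetical values function that returns a list instead of printing, and a top-level integer constructor.

```ocaml
module type COUNTER = sig
  val get : unit -> int
  val bump : unit -> unit
end

(* Hypothetical top-level constructor, same body as Integer.create. *)
let integer () =
  (module struct
    let value = ref 0
    let get () = !value
    let bump () = value := !value + 1
  end : COUNTER)

(* The pattern (module Counter : COUNTER) unpacks the argument, so
   Counter is usable as a regular module in the function body. *)
let values ~n ~counter:(module Counter : COUNTER) =
  let acc = ref [] in
  for _ = 1 to n do
    acc := Counter.get () :: !acc;
    Counter.bump ()
  done;
  List.rev !acc

let () =
  values ~n:5 ~counter:(integer ())
  |> List.map string_of_int
  |> String.concat " "
  |> print_endline
```

With the integer counter this should print "0 1 2 3 4".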
That was the last example. I hope it was understandable.
Conclusion
You might be wondering which approach is the best. I'm afraid I really can't say, but my preferred approach is probably the second one (with records) because it uses features that pretty much any OCaml developer is familiar with.
Approach 1 (classes) is a close second though, and approach 3 comes last, in my opinion. That being said, your preference may be different, and you may encounter situations where one of these less common approaches is actually required, or at least more suitable.
Personally, I wonder if there are other solutions to this problem that I might have missed. Are there better ways to implement object interfaces? Or more convoluted approaches?
References
This article was inspired by a post in the OCaml forum. To fill the gaps in my understanding of objects and first-class modules, I used the excellent documentation on the official website and Real World OCaml.