Object interfaces in OCaml

2024-07-21

Our Running Example: Counters
Approach 1: Classes
Approach 2: Records
Approach 3: First-Class Modules
Conclusion
References

This post is about interfaces, or more precisely, interfaces with dynamic dispatch. We'll briefly cover why these can be useful but the main point is to explore our options in OCaml.

Interfaces specify how to interact with a software module while hiding its implementation details. In some cases, an interface can even hide which implementation sits behind it, with the choice being made at runtime. This is dynamic dispatch, which is very common in object-oriented languages such as C++ (with virtual methods) or Java.

Here are some common interfaces:

In Java, a Set can be implemented in different ways (e.g. tree set, hash set), but the user of the set doesn't need to know what implementation is used.
In Python, io.RawIOBase is the interface used to read files. The object behind it might actually represent a network stream, or any suitably behaved sequence of bytes. The same sort of interface exists in many languages, including Java and Rust.

So, what about OCaml? Well, OCaml has modules and interfaces, but those are typically statically dispatched. What if we wanted to do dynamic dispatch, in an object-oriented way?
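To make the contrast concrete, here is a minimal sketch of static dispatch with ordinary modules. The names GREETER, English and French are made up for this illustration and are not used in the rest of the post.

```ocaml
(* Hypothetical names, for illustration only. *)
module type GREETER = sig
  val greet : string -> string
end

module English : GREETER = struct
  let greet name = "Hello, " ^ name
end

module French : GREETER = struct
  let greet name = "Bonjour, " ^ name
end

(* The implementation is named in the source code, so the compiler
   knows at compile time that this call goes to English.greet. *)
let message = English.greet "OCaml"
```

Switching to French here means editing the code and recompiling, which is exactly what dynamic dispatch lets us avoid.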

Our Running Example: Counters

To explore the different options offered by OCaml concisely, I'm afraid I'm going to use an artificial, though hopefully not simplistic, example: counters.

A counter implementation starts from an initial value (defined by that implementation), can be bumped, and can return its current value.

Below is how we are going to visualize our counter implementations. This command-line program should take one argument (integer or power) and print the first 10 values of the counter:

let () =
  (* <> is structural inequality; != compares physical identity. *)
  if Array.length Sys.argv <> 2 then (
    Printf.printf "usage: %s <integer|power>\n" Sys.argv.(0);
    exit 1);

  let counter = get_counter ~counter:Sys.argv.(1) in

  match counter with
  | None ->
      Printf.printf "Unknown counter: %s\n" Sys.argv.(1);
      exit 1
  | Some counter -> use_counter ~counter

We will later need to define get_counter and use_counter to print the right values.

In a terminal, we want to get something like the following, assuming our counter counts integers:

> ./count integer
0
1
2
3
4
5
6
7
8
9

Approach 1: Classes

This is the obvious approach if you're coming from an object-oriented language. After all, the "O" in OCaml stands for "Objective", so this makes sense, right?

Yes, it's perfectly sensible, but for some reason, OCaml classes and objects are not widely used. Maybe this is because of the syntax, which is quite different from that of records, functions and sum types. Regardless, the result is that many developers, myself included, don't have a lot of experience with classes. That being said, let's see how they work.

Interface

Below is the definition of our very simple interface (get and bump).

class type counter = object
  method get : int
  method bump : unit
end

If you're familiar with OCaml, you may have already noticed two syntactic differences from the OCaml you are used to:

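We are defining a class type, not a type. It's a different kind of declaration: the class type itself is used where a class signature is expected (as in class t : counter further down), although the declaration also introduces an object type of the same name that can appear in ordinary type annotations.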
Unlike a function, a method which doesn't need parameters can be defined without any parameter. A function would have needed at least one argument to work as a function (typically, unit is used in such a case).
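The second difference can be sketched as follows. The names o, next, total and not_a_function are hypothetical and only serve this comparison.

```ocaml
let o =
  object
    val mutable count = 0

    (* No parameter: the body still runs each time o#next is invoked. *)
    method next =
      count <- count + 1;
      count
  end

let total = ref 0

(* The unit parameter is what makes this a function. *)
let next () =
  incr total;
  !total

(* Without a parameter, this is just an int computed once, when the
   definition is evaluated. *)
let not_a_function = !total
```

Calling o#next twice yields 1 and then 2, as does calling next () twice, whereas not_a_function keeps the value it had when it was defined.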

Implementations

We define two types of counter:

"integer": This counter starts from zero and is incremented by each call to bump.
"power": This counter starts from one and doubles at each iteration, effectively going through all the powers of two (1, 2, 4, 8, etc).

Here are the corresponding classes:

module Integer : sig
  class t : counter
end = struct
  class t =
    object
      val mutable value = 0
      method get = value
      method bump = value <- value + 1
    end
end

module Power : sig
  class t : counter
end = struct
  class t =
    object
      val mutable value = 1
      method get = value
      method bump = value <- value * 2
    end
end

For each class, we have an internal state value and two implemented methods. This is probably very easy to understand if you're familiar with object-oriented programming.

Note

This implementation uses classes for convenience but this is not necessary. Unlike most languages with OOP capabilities, OCaml doesn't require an object to belong to a class: the type of an object is determined by the signatures of its methods, and the correct use of objects is guaranteed via structural typing. In other words, we could have avoided class t and new, while still using class type and object.
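As a rough sketch of what the note describes, the two counters could be built from immediate objects, without class t or new. The functions integer and power are hypothetical names chosen for this example, and the class type is repeated so that the snippet stands alone.

```ocaml
class type counter = object
  method get : int
  method bump : unit
end

(* Hypothetical constructor functions replacing the classes. The return
   type annotation checks each object against the counter interface. *)
let integer () : counter =
  object
    val mutable value = 0
    method get = value
    method bump = value <- value + 1
  end

let power () : counter =
  object
    val mutable value = 1
    method get = value
    method bump = value <- value * 2
  end
```

With this variant, get_counter would call integer () and power () instead of new Integer.t and new Power.t.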

Usage

And now we define the missing functions: (1) Instantiating the right kind of counter depending on the input argument, and (2) printing the first 10 numbers of the chosen counter.

let get_counter ~counter =
  match counter with
  | "integer" -> Some (new Integer.t)
  | "power" -> Some (new Power.t)
  | _ -> None

let use_counter ~counter =
  for _ = 0 to 9 do
    Printf.printf "%d\n" counter#get;
    counter#bump
  done

In that use_counter function, as you can see, there is no indication that the counter is of one type or another. Actually, because of dynamic dispatch, not even the compiler can know for sure which implementation of get and bump will be used at runtime.

The same will be true in all the other examples. This is the point of using those interfaces.
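Another way to see dynamic dispatch at work is to put counters of both kinds in the same list. The sketch below is self-contained, so it redefines the interface and uses top-level classes named integer and power plus a helper bump_all, all of which are assumptions made for this example.

```ocaml
class type counter = object
  method get : int
  method bump : unit
end

class integer : counter = object
  val mutable value = 0
  method get = value
  method bump = value <- value + 1
end

class power : counter = object
  val mutable value = 1
  method get = value
  method bump = value <- value * 2
end

(* Bump every counter n times and return the final values. The method
   that runs for c#bump depends on the object found at run time. *)
let bump_all n (counters : counter list) =
  List.map
    (fun c ->
      for _ = 1 to n do
        c#bump
      done;
      c#get)
    counters

let results = bump_all 3 [ new integer; new power ]
```

Here results is [3; 8]: the same loop body incremented one counter and doubled the other.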

Approach 2: Records

Records are widely used in OCaml. They are like structs in other languages such as C or Rust. When we say that OCaml has ADTs (algebraic data types), this is because it has sum types (e.g. type value = Int of int | Str of string) and product types (e.g. records and tuples).

Here we are going to combine records with first-class functions and lexical closures, two very typical features of any functional language. Since these concepts are so prevalent, this is probably how most OCaml developers would define our "counter" interface.

Interface

The functions are defined as fields in the record type. You might be wondering where we will be storing the state of the counter (its integer): we'll take care of that later. For now, all we want to expose is the interface (get and bump), so avoiding the exposure of the internal state here is actually good for us.

type record = { get : unit -> int; bump : unit -> unit }

Implementations

We define two "constructors": Integer.create and Power.create. Each constructor takes a dummy argument () and then mostly does two things:

Create a mutable integer variable.
Create a record of type record with functions referencing the integer variable.

When create () returns, the value goes out of scope so it cannot be used by any code except the two functions in the returned record. As such, it's really an internal state.

Every time create () is used, a new value is created just for one record.

Note

This works because the anonymous functions defined for get and bump capture their environment, which contains value, thanks to a lexical closure. As mentioned earlier, most languages with first-class functions have such closures.

module Integer : sig
  val create : unit -> record
end = struct
  let create () =
    let value = ref 0 in
    { get = (fun () -> !value); bump = (fun () -> value := !value + 1) }
end

module Power : sig
  val create : unit -> record
end = struct
  let create () =
    let value = ref 1 in
    { get = (fun () -> !value); bump = (fun () -> value := !value * 2) }
end
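To check the claim that each call to create () gets its own private state, here is a small self-contained sketch. It repeats the record type and the integer constructor as a plain top-level create function, and the names a and b are chosen for this example only.

```ocaml
type record = { get : unit -> int; bump : unit -> unit }

let create () =
  (* A fresh reference is allocated on every call. *)
  let value = ref 0 in
  { get = (fun () -> !value); bump = (fun () -> value := !value + 1) }

let a = create ()
let b = create ()

let () =
  a.bump ();
  a.bump ();
  b.bump ();
  (* Each record only sees the reference captured by its own closures. *)
  Printf.printf "a = %d, b = %d\n" (a.get ()) (b.get ())
```

This prints a = 2, b = 1: bumping one counter has no effect on the other.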

Usage

Using the counters defined above is rather straightforward, and not much different from what we did with objects in our first approach.

let get_counter ~counter =
  match counter with
  | "integer" -> Some (Integer.create ())
  | "power" -> Some (Power.create ())
  | _ -> None

let use_counter ~counter =
  for _ = 0 to 9 do
    Printf.printf "%d\n" (counter.get ());
    counter.bump ()
  done

Approach 3: First-Class Modules

Now let's have some fun. First-class modules may have their uses but they are definitely not my first choice when it comes to interfaces with dynamic dispatch. That being said, this post is a good opportunity to learn more about them.

I assume you know what an OCaml module is. The idea behind first-class modules is similar to that behind first-class functions: using modules as values (e.g. storing them in variables or passing them as function arguments).

Note

First-class modules are quite different from functors although they share some characteristics. The latter are much more common. For example, you will find functors such as Map.Make in the standard library.
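The difference can be sketched as follows. Map.Make and String come from the standard library, while StringMap, ages, GREETING and pick_greeting are hypothetical names used only here.

```ocaml
(* A functor is applied at the module level: StringMap is fixed at
   compile time. *)
module StringMap = Map.Make (String)

let ages = StringMap.(empty |> add "alice" 30 |> add "bob" 25)

module type GREETING = sig
  val text : string
end

(* A first-class module is an ordinary value: which module is returned
   depends on a run-time argument. *)
let pick_greeting formal : (module GREETING) =
  if formal then (module struct let text = "Good morning" end : GREETING)
  else (module struct let text = "Hi" end : GREETING)
```

The functor produces a module from another module, whereas pick_greeting returns a module the way any function returns a value.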

Interface

As with all our earlier approaches, we first define our interface. This time, it takes the form of a module type:

module type COUNTER = sig
  val get : unit -> int
  val bump : unit -> unit
end

Nothing first-class here: this is an ordinary module type, like any other signature.

Implementations

Complications start here, and you should soon understand why this approach is not so common. Have a look (explanations after the code):

module Integer : sig
  val create : unit -> (module COUNTER)
end = struct
  let create () =
    (module struct
      let value = ref 0
      let get () = !value
      let bump () = value := !value + 1
    end : COUNTER)
end

module Power : sig
  val create : unit -> (module COUNTER)
end = struct
  let create () =
    (module struct
      let value = ref 1
      let get () = !value
      let bump () = value := !value * 2
    end : COUNTER)
end

As with records, we define constructors Integer.create and Power.create. The differences are the following:

The return type changes: record → (module COUNTER)
The creation of the container changes: { get = … } → (module struct let … end : COUNTER)
The integer state is stored inside the container, whereas with records it lived outside and was captured via a lexical closure. Actually, that was not a necessary change, but I can afford it because the state is hidden thanks to the COUNTER signature.

Note

The surrounding parentheses are mandatory in the items listed above. This is part of the distinct syntax for first-class modules.
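For comparison, here is a sketch of the variant that keeps the state outside the module, as the record version does. It is written as a single top-level create function for the integer counter and repeats the module type so that it stands alone.

```ocaml
module type COUNTER = sig
  val get : unit -> int
  val bump : unit -> unit
end

let create () =
  (* The state lives outside the module and is captured by the
     functions defined inside it. *)
  let value = ref 0 in
  (module struct
    let get () = !value
    let bump () = value := !value + 1
  end : COUNTER)
```

Both variants behave the same from the outside, since COUNTER exposes only get and bump.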

Usage

We're not quite done yet. In fact, those modules are not that complicated, but they require this weird syntax.

In the following code, notice how we actually use the first-class module: by converting it from a value (counter) into an actual module (Counter):

let get_counter ~counter : (module COUNTER) option =
  match counter with
  | "integer" -> Some (Integer.create ())
  | "power" -> Some (Power.create ())
  | _ -> None

let use_counter ~(counter : (module COUNTER)) =
  let module Counter = (val counter : COUNTER) in
  for _ = 0 to 9 do
    Printf.printf "%d\n" (Counter.get ());
    Counter.bump ()
  done
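As a side note, the unpacking step can also be written directly in the parameter pattern, which saves the let module line. The sketch below is a variant of the same use_counter function, not a new approach. It repeats COUNTER and adds a hypothetical integer constructor so that it can run on its own.

```ocaml
module type COUNTER = sig
  val get : unit -> int
  val bump : unit -> unit
end

(* Hypothetical stand-in for Integer.create. *)
let integer () =
  (module struct
    let value = ref 0
    let get () = !value
    let bump () = value := !value + 1
  end : COUNTER)

(* The pattern (module Counter : COUNTER) unpacks the argument into a
   module named Counter for the body of the function. *)
let use_counter ~counter:(module Counter : COUNTER) =
  for _ = 0 to 9 do
    Printf.printf "%d\n" (Counter.get ());
    Counter.bump ()
  done
```

Callers are unaffected: use_counter ~counter:(integer ()) still prints the values 0 to 9.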

That was the last example. I hope it was understandable.

Conclusion

You might be wondering which approach is the best. I'm afraid I really can't say, but my preferred approach is probably the second one (with records) because it uses features that pretty much any OCaml developer is familiar with.

Approach 1 (classes) is a close second though, and approach 3 comes last, in my opinion. That being said, your preference may be different, and you may encounter situations where one of these less common approaches is actually required, or at least more suitable.

Personally, I wonder if there are other solutions to this problem that I might have missed. Are there better ways to implement object interfaces? Or more convoluted approaches?

References

This article was inspired by a post in the OCaml forum. To fill the gaps in my understanding of objects and first-class modules, I used the excellent documentation on the official website and Real World OCaml.