Leaving OCaml

Part of a three-part series. Followups cover F# and Rust.

I built the first demo of Dark in Python, in about two weeks. A few months later when I started productizing it, I rebuilt it in OCaml. Back in 2017, when I was considering the language and platform to use for Dark, OCaml was extremely compelling:

  • it's a high-level language with static types, so it was easy to make large-scale changes as we figured out what the language/product was
  • you mostly model data with sum types, which in my mind are the best way to model data
  • it's very similar to the language I wanted to build (in particular, we could reuse built-in immutable data structures for Dark's values)
  • it had a reputation for being high-performance, which meant that we could write an interpreter for Dark and not have it be terribly slow (vs writing an interpreter in Python, which might be too slow)
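
To make the points about sum types and immutable data structures concrete, here is a minimal sketch of how an interpreter's values might be modeled in OCaml. The `dval` type, its constructors, and `to_repr` are hypothetical names chosen for illustration; this is not Dark's actual definition.

```ocaml
(* Illustrative only: a hypothetical value type for a small interpreter. *)
module StrMap = Map.Make (String)

type dval =
  | DInt of int
  | DStr of string
  | DBool of bool
  | DList of dval list        (* built-in immutable list *)
  | DObj of dval StrMap.t     (* immutable map from the standard library *)

(* The match below is exhaustive. If a constructor is later added to dval,
   the compiler warns about every match that does not handle it, which is
   what makes large-scale changes comparatively safe. *)
let rec to_repr (v : dval) : string =
  match v with
  | DInt i -> string_of_int i
  | DStr s -> "\"" ^ s ^ "\""
  | DBool b -> string_of_bool b
  | DList items -> "[" ^ String.concat ", " (List.map to_repr items) ^ "]"
  | DObj fields ->
      let pairs =
        StrMap.bindings fields
        |> List.map (fun (k, v) -> k ^ ": " ^ to_repr v)
      in
      "{" ^ String.concat ", " pairs ^ "}"

let () =
  let user =
    StrMap.empty
    |> StrMap.add "name" (DStr "Ada")
    |> StrMap.add "age" (DInt 36)
  in
  (* Adding to the map returns a new map; user itself is unchanged. *)
  let updated = StrMap.add "age" (DInt 37) user in
  print_endline (to_repr (DObj user));
  print_endline (to_repr (DObj updated))
```

Because both the list and the map are immutable, the interpreter's values can share structure freely without defensive copying.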

Unfortunately, as we've built Dark we've run into significant problems that have made it challenging to build in OCaml.

Lack of libraries

When you bet on an off-mainstream language, one of the things you accept is that many libraries are not going to be available. When there is a small community, often there aren't enough people working in the language to make important libraries. This is especially true if few people are building business applications.

In OCaml there are many high quality libraries, especially for data structures and data manipulation. The annual Jane Street code dump has been quite useful and very high quality. However, we really felt the lack of several libraries. The most obvious of these is that we had to build a Unicode string library ourselves (built on top of the very impressive OCaml Unicode libraries built by Daniel Bünzli), but we needed many more libraries than that.

The lack of an SDK for Google Cloud has affected us greatly. When you're searching for product-market fit, you do the simplest, easiest thing. If you lack a good SDK for your cloud provider, the simplest, easiest thing is often a terrible architectural choice. We've built our own queue on top of our database rather than using the production-quality cloud queues available on GCP. Similarly, we barely use Cloud Storage (GCP's version of S3), because we initially put things in the database since that was easier. We've built three services, two in Rust and one in Go, to work around the challenges we've faced.

The biggest challenge here is our use of Postgres. Postgres is a great database and we're big fans, but Cloud SQL is not a great hosted database. GCP's position is that Cloud SQL is there to tick a box and we should be using Cloud Spanner. I would love to switch to Cloud Spanner, but we have no driver for it in OCaml. Given the Postgres driver in OCaml is not particularly mature, it's hard to expect that a Cloud Spanner driver would exist, and indeed it doesn't. We've had to contribute to the OCaml Postgres driver, and some parts of our codebase have been well and truly mangled when working around features not supported in that driver.

We've also suffered from a lack of a high-level, production web stack (there are low-level stacks with good reputations that I've struggled to use, and a few new ones out there that look good), in particular lacking a user authentication module. We've been using Auth0 to work around this for now, which has more moving pieces than I'd like, and a shockingly high cost (our 7000 users, most of whom never log in, cost us over $500/mo).

We've worked around other missing vendor SDKs by calling their HTTP endpoints directly, and that's been mostly fine. However, for things like encryption we don't have that option: we hacked around a missing encryption library, but decided not to ship it to production until we audited it for security (which was never actually worth the cost).

At CircleCI, we bet on Clojure. That was also a non-mainstream language, but its ability to call Java SDKs meant we had a mature cloud library, which was essential for building CircleCI. Of course, in OCaml we could call C libraries (and even Rust libraries, perhaps), but it doesn't match having native libraries we can call directly.

Learnability

I'm mostly in the camp that anyone can learn any language, but I saw a team struggle with OCaml, and for good reason. Language tutorials are extremely poor in OCaml compared to other languages; they're mostly lecture notes from academic courses.

The compiler isn't particularly helpful, certainly compared to Rust or Elm (both of which have been in our stack at one point). Often it gives no information about an error. Syntax errors typically just say "Syntax error", though the compiler will try to give a good error for a mismatched brace, often incorrectly. Type errors can be a real burden to read, even after 3 years of experience with them.

The docs in OCaml are often challenging to find. The Jane Street docs have improved significantly in the last few years, but for most libraries it can be a challenge to even figure out what functions are available in a particular module. Compare this to the excellent docs.rs in Rust, which has comprehensive API docs for every package in Rust.

One of the ways I personally struggled in OCaml is around Lwt. Lwt is (one of!) OCaml's async implementations. I couldn't figure it out several years ago and so just built a single-threaded server. The number of workarounds and the amount of downtime we've suffered from that single decision is immense. A tutorial around building high-performance (or even medium-performance!) web servers would be very valuable.

Tooling

I had read that tooling would be good in OCaml. I remember reading there was a debugger that could go back in time! I don't know where that's gone, but I've never heard of anyone using it.

We have struggled to make editor tooling work for us. This is partially because we also use ReasonML and this seems to break things. Unfortunately, this is common in programming, but even more so in small communities: you might be the first person to ever try to use a particular configuration.

Finally, the disconnect between the various tools is immense. You need to understand Opam, Dune, and Esy to be able to get something working (you could also do it without Esy and just rely on Opam, but that's much worse). I talked about a bunch of these challenges here.

Language problems

Multicore is coming Any Day Now™️, and while this wasn't a huge deal for us, it was annoying.

Minor annoyances

One of my biggest annoyances was how often OCaml folks talk about Fancy Type System problems, instead of how to actually build products and applications. In other communities for similar languages (ReasonML, Elm, F#), people talk about building apps and solving their problems. In OCaml, it feels like people spend an awful lot of time discussing Functors. It's not quite at the level that I perceived in the Haskell world, but it made clear that the people building the core of the ecosystem do not have the same problems that I do (which is building web-y stuff).

Was OCaml the wrong choice?

I honestly think OCaml was a great choice at the start. Being able to quickly and safely make large-scale changes to your app is something that statically typed functional languages excel at. I'm happy that we made the choice, and in retrospect, it still seems like the best choice of those we had at the time.

What's next?

I'm working on building the next version of the backend. We have about 20k lines to be replaced, and they'll be rewritten in a new language while keeping the semantics the same. I plan to keep the frontend in ReasonML: it doesn't suffer from the same library problems, as it can interface nicely with JS, and it's nearly 50k lines of code, so it would be a much bigger undertaking.

Read the followup to see what we picked!

