Darklang year in review - 2021

The Scribe by George Cattermole

We just slipped into March, so this is as good a time as any to review what happened in Darklang in 2021. While the bulk of this post is technical details about the rewrite, I have included some company details at the bottom too. Enjoy!

For context, Darklang is an integrated programming language, editor, and cloud infrastructure, making it trivial to program APIs and build backend applications. Dark hit a bit of a tough patch in 2020, reducing the company to just me. In 2021 I decided to focus on a much-needed backend rewrite to create a strong technical platform for the future.


Pretty much the only activity across 2021 was rewriting the Darklang backend. I started writing it in late 2020, and have been working on it (as of March 1) for 14 months.

A Dark app, showing the string "hello on the new F# implementation of darklang"

We discussed this a little in the last community meetup, but it's now basically done! The testing, logging, hardening, fuzzing, devops, deployment, etc., are all done, coming in at about 68,000 lines of code. There remains a mere handful of tasks and final checks - the last major blocker was a security review, which just gave us the all-clear! I put the first piece of new code into service at the end of 2020, and I'm in the process of putting the rest into service to do final tests.

The rewrite, apart from the amount of time it took, has gone pretty well. While it was a pretty straight port from OCaml to F#, I took the opportunity to redo a number of components "right", and some of them worked out very nicely.

Backward compatibility

The new version aims to be bug-for-bug compatible with the old one, and this took a lot of time in the rewrite. The testing process (described below) discovered many, many behaviours that we would classify as bugs, and we decided to replicate the bugs to ensure that no programs break. Once we've fully switched over to the new version, we'll deprecate the buggy old functions and add new versions that have nicer behaviour.

We managed to keep the changes between the versions extremely small, and spent some time deciding on what changes we would allow, which we added as a backwards compatibility section in the docs.

While we decided that having the exact same JSON formatting wasn't necessary, in some cases we used our JSON formatters to generate signatures for JWTs — if we changed the formatting, customer signatures would be changed. For those JSON formatters, we wrote byte-for-byte compatible replacements in the new version.
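
To illustrate why the formatting matters for signatures, here is a minimal Python sketch (not Dark's actual F# code). The payload, secret, and `sign` helper are hypothetical; the point is only that a signature is computed over the exact serialized bytes, so a whitespace-only change produces a different signature for the same data.

```python
import hashlib
import hmac
import json

# Hypothetical payload and secret, for illustration only.
payload = {"sub": "user-1", "admin": False}
secret = b"not-a-real-secret"


def sign(message: bytes) -> str:
    """HMAC-SHA256 over the exact bytes of the serialized JSON."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# Two serializations of the same data that differ only in whitespace.
compact = json.dumps(payload, separators=(",", ":")).encode("utf-8")
spaced = json.dumps(payload, separators=(", ", ": ")).encode("utf-8")

# The decoded data is identical, but the signatures are not.
same_data = json.loads(compact) == json.loads(spaced)
same_signature = sign(compact) == sign(spaced)
```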

Some other changes were imposed by our choice of underlying HTTP clients. The new code uses the .NET builtin HttpClient, while the old code was built around Curl. Both are extremely capable HTTP clients, obviously, but they have different default behaviour that wasn't always possible to replicate. For example, the .NET client will not allow invalid Content-Type headers, while curl will allow anything.

All of the edge cases are documented in the Changelog.


Dark is now really well tested. The OCaml version of the backend has a total of 250 tests–the F# one has over 10,000.

In the old version of Dark, creating tests was a real pain. For example, here is how we used to test addition:

let t_int_add_works () =
  check_dval "int_add" (Dval.dint 8) (exec_ast (binop "+" (int 5) (int 3)))

To be able to ensure that the new and old versions of Darklang worked the same, we needed to add thousands of tests, so it was important to make it incredibly easy to write tests. For the test above, we should instead be writing tests like:

5 + 3 = 8

However, Darklang doesn't have a parser, so writing tests like this wasn't trivial. Programs are stored as ASTs, which are a sort-of DOM-like way of representing and editing source code in compilers. So to add tests, I repurposed the F# parser to create Dark programs. The F# parser is accessible as a library, and the F# AST is public and well-documented.

The translation layer was straightforward. Since F# is (mostly) a superset of Dark, most Dark programs can be written in F#. I also added some metadata to the tests to represent things that aren't in F#, such as Datastores, or the ErrorRail.
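
The same trick can be sketched in Python rather than F#: borrow the host language's parser and translate its AST into your own. Everything below is an assumption made for illustration (the tuple-based AST, the `parse_test`, `evaluate`, and `run_test` helpers, and the use of `==` where the Dark tests write `=`); it is not how the real translation layer is written.

```python
import ast


def to_toy_ast(node: ast.AST):
    """Translate a Python AST node into a tiny tuple-based AST (hypothetical format)."""
    if isinstance(node, ast.Expression):
        return to_toy_ast(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, str)):
        return ("const", node.value)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return ("call", "+", to_toy_ast(node.left), to_toy_ast(node.right))
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and isinstance(node.ops[0], ast.Eq)
    ):
        return ("expect_equal", to_toy_ast(node.left), to_toy_ast(node.comparators[0]))
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def parse_test(line: str):
    """Let the host parser do the hard work, then translate its output."""
    return to_toy_ast(ast.parse(line, mode="eval"))


def evaluate(expr):
    """A toy interpreter for the translated AST."""
    if expr[0] == "const":
        return expr[1]
    if expr[0] == "call" and expr[1] == "+":
        return evaluate(expr[2]) + evaluate(expr[3])
    raise ValueError(f"cannot evaluate: {expr!r}")


def run_test(line: str) -> bool:
    """A test line passes when both sides evaluate to the same value."""
    tree = parse_test(line)
    if tree[0] != "expect_equal":
        raise ValueError("a test line must be an equality")
    _, actual, expected = tree
    return evaluate(actual) == evaluate(expected)
```

With this in place, `run_test("5 + 3 == 8")` is all it takes to add a test, which is the property that made writing thousands of them practical.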

Once tests became easy to add, suddenly we were adding dozens of them. Here's a selection of tests of the if-statement:

(if true then "correct" else 0) = "correct"
(if false then "" else "correct") = "correct"
(if null then "" else "correct") = "correct"
(if Test.typeError_v0 "msg" then "" else "") = Test.typeError_v0 "Expected boolean, got error"
(if List.head_v1_ster [] then "" else "") = Test.errorRailNothing_v0
(if blank then "" else "") = blank
(if 5 then "correct" else "") = "correct"

You can see we added a Test library to create values that were hard to create via the language, and we added the _ster suffix to put values on the ErrorRail, and the blank literal to represent unfilled-in values in the program.

To test that the new version and the old version were bug-for-bug compatible, we run all tests against both the old and new versions of Dark, ensuring the outcome is the same. All functions and language features are now tested this way.
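
As a rough sketch of what running a test against both versions means, the Python below compares two stand-in implementations on the same inputs, treating a raised error as an outcome to be matched too. The functions `old_div` and `truncating_div` and the `compare` helper are hypothetical; the real harness runs Dark programs against the OCaml and F# backends.

```python
def outcome(impl, *args):
    """Run impl and capture either its value or the kind of error it raised."""
    try:
        return ("ok", impl(*args))
    except Exception as exc:
        return ("error", type(exc).__name__)


def old_div(a, b):
    # Stand-in for the old implementation: rounds towards negative infinity.
    return a // b


def truncating_div(a, b):
    # Stand-in for a naive port: rounds towards zero instead.
    return int(a / b)


def compare(old_impl, new_impl, cases):
    """Return every case where the two implementations disagree."""
    mismatches = []
    for args in cases:
        old, new = outcome(old_impl, *args), outcome(new_impl, *args)
        if old != new:
            mismatches.append((args, old, new))
    return mismatches


cases = [(7, 2), (-7, 2), (1, 0)]
mismatches = compare(old_div, truncating_div, cases)
```

Here both versions agree on `(7, 2)` and both fail the same way on `(1, 0)`, but they disagree on `(-7, 2)`. A bug-for-bug compatible port has to reproduce whatever the old version did in that case.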

This worked really well for functions and language features, but it wasn't a good fit for testing Dark's HttpClient or Http framework, both of which needed a server-side component to test against.

We instead built a special test framework for testing HttpClients, and tests look as follows:

[expected-request]
Accept: */*
Accept-Encoding: deflate, gzip, br
Content-Type: text/plain; charset=utf-8
Host: HOST

[response]
HTTP/1.1 200 OK
Date: xxx, xx xxx xxxx xx:xx:xx xxx
Content-type: text/plain; charset=utf-8
Content-Length: LENGTH

"Hello back"

[test]
(let response = HttpClient.get_v5_ster "http://URL" {} {} in
 let respHeaders = response.headers |> Dict.remove_v0 "Date" in
 Dict.set_v0 response "headers" respHeaders) =
   { body = "\"Hello back\""
     code = 200
     error = ""
     headers =
       { ``Content-Length`` = "LENGTH"
         ``Content-Type`` = "text/plain; charset=utf-8"
         ``HTTP/1.1 200 OK`` = ""
         Server = "Kestrel" }
     raw = "\"Hello back\""}

The [test] code at the bottom is run, and requests to URL reach a server which checks that the request exactly matches [expected-request]. If it does, [response] is returned and the [test] code continues.
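
A test file in this style can be split into its three parts with very little code. The Python sketch below is an assumption about how such a loader could look, not the actual F# implementation; `split_sections` and `request_matches` are hypothetical names, and the example file is abbreviated.

```python
import re

# Section markers as named in the description above.
SECTION = re.compile(r"^\[(expected-request|response|test)\]$")


def split_sections(text: str) -> dict:
    """Split a test file into its named sections."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION.match(line.strip())
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def request_matches(expected: str, actual: str) -> bool:
    """Exact comparison, normalising only the line endings."""
    return expected.replace("\r\n", "\n") == actual.replace("\r\n", "\n")


example = """[expected-request]
Accept: */*
Host: HOST

[response]
HTTP/1.1 200 OK
Content-Length: LENGTH

"Hello back"

[test]
HttpClient.get_v5_ster "http://URL" {} {}
"""

sections = split_sections(example)
```

A test server would then answer with `sections["response"]` only when `request_matches` succeeds on the incoming request.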

We actually have 6 different versions of the HTTP clients (1 current version and 5 deprecated functions), and so each of our tests is actually duplicated 6 times and tweaked appropriately for the client in question.

We did a similar thing for the Http framework, which I won't show you, but it again allowed us to check that we support all the nooks and crannies of the Darklang HTTP framework.


While most tests were hand-written for edge cases we found as we ported the code, we also invested a lot of time into creating and using Fuzzers. A "Fuzzer" automatically generates inputs according to some description, and tests to see if it can violate some property. (People who are into testing often call this a "property test", and the functional language community will often call this QuickCheck. The compiler, parser and protocol folks call them fuzzers, so that's the name I went with here).

Lots of different things in Dark can be tested quite effectively with fuzzers. The most obvious one is calling built-in functions: since we have a reference implementation of Dark (the implementation in OCaml), we can generate programs and see if the new and old versions give the same result. This found hundreds of hairy edge cases; in many cases it found that the old implementation was horribly buggy. All our fuzztests are available here.
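
The idea of fuzzing against a reference implementation fits in a few lines. This is a hand-rolled Python sketch under stated assumptions, not our fuzzers: `reference_trim`, `candidate_trim`, `gen_string`, and `fuzz` are hypothetical, and a real property-testing library would also shrink failing inputs to a minimal case.

```python
import random


def reference_trim(s: str) -> str:
    # Stand-in for the reference implementation: trims spaces only.
    return s.strip(" ")


def candidate_trim(s: str) -> str:
    # Stand-in for the port: trims all whitespace, which is subtly different.
    return s.strip()


def gen_string(rng: random.Random) -> str:
    """Generate a short string biased towards whitespace edge cases."""
    length = rng.randint(0, 5)
    return "".join(rng.choice(" \ta") for _ in range(length))


def fuzz(reference, candidate, generate, runs=200, seed=0):
    """Return the generated inputs on which the two implementations disagree."""
    rng = random.Random(seed)  # seeded, so any failure is reproducible
    failures = []
    for _ in range(runs):
        value = generate(rng)
        if reference(value) != candidate(value):
            failures.append(value)
    return failures


failures = fuzz(reference_trim, candidate_trim, gen_string)
```

Every input in `failures` contains a tab, which points straight at the behavioural difference between the two functions.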

Cleaning up the Http framework

In the old version, our Http stack was a mess. We had only one server to handle both API requests and customer traffic (programs written in Dark), and often code written for one of them accidentally affected the other. In the new version, we pulled apart the ApiServer (the server that loads our editor and provides APIs for it) from the BwdServer (the server that all of our customer traffic runs on, usually at builtwithdark.com, hence the name), greatly simplifying the code for both.

Dark has a Http framework that has grown over the years. Part of my goal for the rewrite was to rewrite the Dark Http framework as middleware made up of functions in Dark. Sadly, this proved too hard — many of the functions we needed for this don't exist in Dark yet, and trying to support a bug-for-bug compatible version didn't seem like it was going to be possible without adding a lot of functionality. That would have blown up the scope of the rewrite, and so I nixed it.

However, I did manage to pull most of the middleware out into its own module, so we got something from the exercise at least.


I was astonished at the amount of time in the rewrite that was spent on DevOps. I probably spent about 30% of the time on various devops: containers, healthchecks, telemetry, logging, kubernetes, firewalls, exception tracking, DB connections, and probably some other stuff I can't think of at this moment. I even wrote a custom deployment tool called shipit to abstract the hundreds of lines of bash I was duplicating. For example, here is a shipit.yaml file for the ApiServer's Kubernetes deployment:

      - apiserver-network-policy.yaml

    # Manually deployed so it can be used to override
        text-file: nginx-override.conf

    # If you change the nginx override, it needs a rolling restart
      - kubectl rollout restart deployment apiserver-deployment

    config-template: apiserver-deployment.template.yaml
        text-file: ../../containers/ocaml-nginx/base-nginx.conf
        text-file: nginx.conf
        env-file: ../../config/gke-builtwithdark
      - gcp-fsharp-apiserver
      - legacyserver

This definitely has me thinking about how we can move off Kubernetes (looking at Google Cloud Run as an alternative).


One of the goals of the rewrite was to switch to a completely asynchronous implementation of Dark. I'm glad to say this went well. I started the rewrite just before the release of .NET5, when some parts of F# had not yet caught up with the new Task-based way of doing async. Fortunately, .NET5 and .NET6 have since been released, each of which included significant upgrades to tasks in F#, including native support in F# 6. Ironically, we don't actually use native tasks within the interpreter because Ply tasks are faster, but we hope to get there one day and have the best of both worlds.

Now, every part of Dark is fully asynchronous. Since the old version of Dark was synchronous, requests would sometimes get caught behind a long running request. I expect the P95 latency will be significantly improved as a result.
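
The effect is easy to demonstrate with Python's asyncio standing in for .NET tasks. In this sketch the handler names and sleep durations are made up, and `asyncio.sleep` stands in for waiting on I/O such as a database call.

```python
import asyncio


async def handle(name: str, seconds: float, finished: list) -> None:
    # Awaiting yields control, so other requests can make progress meanwhile.
    await asyncio.sleep(seconds)
    finished.append(name)


async def serve() -> list:
    finished = []
    # The slow request is started first, but it does not block the fast one.
    await asyncio.gather(
        handle("slow", 0.2, finished),
        handle("fast", 0.01, finished),
    )
    return finished


completion_order = asyncio.run(serve())
```

In a synchronous server handling these two requests in order, the fast request would have had to wait for the slow one to finish.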

Deprecated functions

One of the features of Dark is that we don't change things. Instead of a new release that requires users to upgrade to it, we release new versions of individual functions, and allow users to upgrade from, say, JSON::parse_v0 to JSON::parse_v1.
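
A minimal Python sketch of per-function versioning follows. The registry, the `register` decorator, and the behaviour of the two `JSON::parse` versions are all invented for illustration and do not describe what the real functions do; only the naming scheme comes from Dark.

```python
import json

# Every version ever shipped stays in the registry, keyed by name and version.
STDLIB = {}


def register(name: str, version: int, deprecated: bool = False):
    """Decorator that records a function under a versioned name."""
    def wrap(fn):
        STDLIB[f"{name}_v{version}"] = {"fn": fn, "deprecated": deprecated}
        return fn
    return wrap


@register("JSON::parse", 0, deprecated=True)
def json_parse_v0(text: str):
    # Hypothetical old behaviour: swallow errors and return None.
    try:
        return json.loads(text)
    except ValueError:
        return None


@register("JSON::parse", 1)
def json_parse_v1(text: str):
    # Hypothetical new behaviour: report the failure explicitly.
    try:
        return ("Ok", json.loads(text))
    except ValueError as exc:
        return ("Error", str(exc))


def call(name: str, *args):
    """Programs name the exact version they use, so old programs keep working."""
    return STDLIB[name]["fn"](*args)
```

A program written against `JSON::parse_v0` keeps its old behaviour indefinitely, and upgrading is an explicit edit to call `JSON::parse_v1` instead.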

While this is something I really believe in, as languages go, I hadn't anticipated the amount of work this would cause in a rewrite. Not only do we have to rewrite all the current standard library functions, but we also have to rewrite every version of every standard library function that has ever existed.

That was annoying and frustrating, and is certainly making me rethink how this plays out in the long run - though, it's not like we're going to do another rewrite anytime soon.

New IR

Typically, compilers have different "intermediate representations": a high-level AST representing the syntax of the program, then multiple different levels of language, each simpler and closer to machine code than the previous. Compiler writers being a creative sort, these are traditionally called "HIR", "MIR", and "LIR", for high-level, medium-level, and low-level intermediate representations.

Dark only had one in the old days, and with the rewrite, now we have two. The first, retroactively named ProgramTypes, represents how the program is stored. The new IR is called RuntimeTypes, and more closely aligns with what the interpreter wants to see. Some differences:

  • we flatten out pipes in the RuntimeTypes, for example 4 |> add 1 |> sub 2 becomes sub 2 (add 1 4)
  • ProgramTypes cares about how to represent the program textually, so its float definition is made up of a string representing the whole number part, a string representing the fractional part, and a bool with the sign. RuntimeTypes simply uses a float
  • ProgramTypes has 3 kinds of partial, a left-partial, right-partial, and regular partial. These are used to display what the user is typing right now, before it gets turned into an expression. RuntimeTypes doesn't care at all, and has just one type of partial, which just wraps the expression it evaluates to.
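
The first two differences can be sketched as a small lowering pass. This is Python rather than F#, and the node shapes (`PInt`, `PFloat`, `PCall`, `PPipe`, `RValue`, `RCall`) are simplified assumptions, not the real ProgramTypes and RuntimeTypes definitions. The piped value is placed last to match the example above.

```python
from dataclasses import dataclass


# ProgramTypes-like nodes: how the program is stored (hypothetical shapes).
@dataclass(frozen=True)
class PInt:
    value: int


@dataclass(frozen=True)
class PFloat:
    positive: bool
    whole: str
    fraction: str


@dataclass(frozen=True)
class PCall:
    name: str
    args: tuple


@dataclass(frozen=True)
class PPipe:
    first: object
    rest: tuple  # calls that are each missing the piped argument


# RuntimeTypes-like nodes: what the interpreter wants to see.
@dataclass(frozen=True)
class RValue:
    value: object


@dataclass(frozen=True)
class RCall:
    name: str
    args: tuple


def lower(expr):
    """Convert a stored expression into its runtime form."""
    if isinstance(expr, PInt):
        return RValue(expr.value)
    if isinstance(expr, PFloat):
        # The textual parts collapse into a single float.
        sign = "" if expr.positive else "-"
        return RValue(float(f"{sign}{expr.whole}.{expr.fraction}"))
    if isinstance(expr, PCall):
        return RCall(expr.name, tuple(lower(arg) for arg in expr.args))
    if isinstance(expr, PPipe):
        # Flatten the pipe: each stage receives the previous result as an argument.
        acc = lower(expr.first)
        for stage in expr.rest:
            acc = RCall(stage.name, tuple(lower(arg) for arg in stage.args) + (acc,))
        return acc
    raise TypeError(f"unknown expression: {expr!r}")


# 4 |> add 1 |> sub 2
piped = PPipe(PInt(4), (PCall("add", (PInt(1),)), PCall("sub", (PInt(2),))))
lowered = lower(piped)
```

After lowering, the interpreter never has to think about pipes or about how a float was typed in, which keeps the evaluation code small.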

Blazor and js_of_ocaml

Part of the magic of Dark is that the interpreter is available in the editor to tell you what an expression does, for example:

A Darklang HTTP handler, with the cursor on the "accept" header field of an HTTP request. You can see the actual runtime value of this expression below.

In the old version, this was accomplished by compiling the OCaml code with js_of_ocaml. While it required some magic to get it to compile, doesn't support functions that use C libraries, and is hard to debug, it has worked out for us pretty well.

In the new version, we compile the F# code to Blazor. This is probably the biggest hurdle so far - we still haven't worked out all the kinks here to make Blazor run as quickly as js_of_ocaml did. Currently, Blazor is running in interpreted mode, meaning it interprets the Dark interpreter (which is itself interpreting the Dark code). It does have an ahead-of-time compilation mode, but we have yet to make it work. When it does, we should have our code compiled to Wasm, which will allow it to go much faster.

Remaining work

We've basically finished the entire port, and are working on deploying it piecemeal at the moment. The remaining work is tracked in our Project Tracker, and is mostly finishing steps and final validations before release. I had hoped to finish in February, but it has sadly slipped to March. I really hope I'm not saying that again at the end of March.

Overall thoughts

Looking back, this was expected to take 6 months, though optimistically I thought it would take 4 months. I even declared "nearly done" after two months, a classic (and in retrospect hilarious) bit of hubris. On the other hand, the result is excellent, with masses of tech debt identified and removed, and much more slated to be cleaned up once the transition is complete. The devops works better, it will be cheaper to run, there are fewer services to run, the tests are far, far more comprehensive than they used to be, and we switched over to built-in or community libraries for many things we had written by hand.

And of course, instead of 3 languages (Rust, OCaml and Go), there will only be one (F#).

Review: 7/10: Overall, while this took much longer than expected, honestly this could have failed horribly, so I'm extremely pleased with how it has gone.


Hiring did not go well in 2021. I had intended to hire two developers, and worked with some recruiters who are focussed on Latin American and African engineers. Alas, it was a bit of a bust. The main challenge was the DEI requirements - none of the recruiters were able to source any number of women for example, and basically stopped working with us when we insisted that they should.

We did manage to hire somebody in 2021, and we also worked with a number of contractors for some work. Alas, none of these worked out for various reasons. In addition, recruiting was consuming a considerable amount of my time and energy.

On the bright side, it did help identify both what we need from candidates in the future, as well as how we want the dev team to operate (which is to say, largely focussed on an open-source-y model of self-directed work).

Fortunately, just at the start of 2022, I managed to hire Stachu Korick. He's been working on the port with me, and we're unlikely to hire again until the port is done.

In the future, I'll need to have a career page, and also to be clear about the work style at Dark to help everyone be successful.

Review: 1/10 (for 2021). Really did not do a good job of outbound, and got lucky with inbound. I'll probably try again in 2022. I would love recommendations for recruiters who work outside the US (preferably Africa and Latin America), especially if they're good at finding a diverse set of candidates.


Financially, Dark is doing fine. We're generally spending around $30k a month, and I have plans to reduce that by a little bit after the rewrite is done. We have over $800k in the bank, which is enough money to get us to 2024 and should be enough to find our way to product-market fit. 2022 or 2023 will be a time to start thinking about revenue, but not until the product is substantially better, so we're good for now.

Review: 5/10: really I wanted to spend a little more to get things done a little faster, but a longer runway is helpful too.

Users / Product-Market Fit

The goal of Dark right now is to get to product market fit - that is to say, to make Dark work well for our users.

As expected, we didn't really get many more users in 2021. There was a conscious decision not to focus on this until the backend rewrite was done, when we would be in more of a position to respond to their needs. We have a handful of people trickling in each day, which is plenty for right now.

Review: 3/10: while not unexpected, also not great.


Overall, I'm pretty happy with how 2021 went. We've gotten into a bit of a groove with working on the backend, have completed a lot of work on reducing our technical debt, and generally I have a lot more optimism at the moment than I had going into 2021.

If you're interested in getting involved with the development of Darklang, come say hi on Slack or follow us on Twitter or Twitch.

Discuss on Twitter or Hacker News.

Thanks to Bhupesh Varshney for reading early versions of this post.