Darklang is going all-in on AI

Like an aging rock star making a final stab at glory, I'm delighted to announce that Darklang is going all in on AI/GPT.

As everyone knows, the folks over at OpenAI produced a magic box that writes code. And it even produces quite good code – not perfect, not by a long shot – but much better than you'd reasonably expect, and improving quickly.

Do we expect that in 3-5 years' time developers will still be typing out artisanal code at a keyboard? I struggle to see it. With recent announcements like ChatGPT Plugins and Copilot X, even three years feels generous. After using AI-based code generation for some time now, and seeing how fast it's evolving, I feel that how code is written is in the process of fundamentally changing.

This means big changes for Darklang. On February 1, we stopped working on what we're now calling "darklang-classic", and are fully heads down on building "darklang-gpt", which is the same core Darklang but redesigned to have AI as the primary (or possibly only) way of writing code.

To follow along, you can follow our GitHub repo, join our Discord or our beta waitlist, or follow me on Twitch. And if you like what we're doing, sponsor us!

Making Darklang good for AI codegen

If you're wondering what we're doing, I'll tell you honestly: I don't know. The magic boxes are completely changing what it means to code, and I would hesitate to venture any confidence on how this looks when we're done.

The vision though remains the same: we want to make it 100x easier to write cloud backends. We still have the goal of enabling large-scale cloud backends for developer teams, though plausibly we might take a circuitous approach to get there.

The initial goal is to find an AI-based codegen approach that's better than existing alternatives, assuming the state of the art here is writing Python or Typescript in VSCode using Copilot. Based on the environment that AI-based codegen is coming into, I have some thoughts on how it may be possible to differentiate.

Holistic

Darklang's major advantage remains the same: the holistic or integrated nature of the platform, language, editor, and infrastructure. Darklang's major features (Deployless, Invisible Infrastructure, and Trace-Driven Development) were all difficult before Dark, because they occurred at the intersection points of the editor, language and infra. By removing those intersections in Darklang, these features became obvious and fell out fairly naturally.

What are the intersection points in AI-generated code? Well, look at what Copilot is building: Copilot Chat in the IDE, Copilot for Pull Requests, Copilot for Docs, and Copilot-cli. Super exciting, but those seem like 4 different products, meaning a lot of intersection points with existing workflows, code bases and production deployments which we have the potential to remove.

Codegen is different from text editing

It's amusing to me that we've primarily been using AI-generated code via Copilot in our IDEs – which are, you know, "text" "editing" tools. These have UXes that were designed for, well, editing text. Are they suitable for code generation? Sure, it's just text, it does anything. But is this the best design? Or put another way, is there a UX paradigm for code generation that's 2-3x better than a text editor? Probably.

One obvious thing is that the AI has to figure out where generated code goes in the codebase, whether creating or replacing it. Inserting code in a codebase is the sort of thing you can do very easily or very well, but usually not both.

We speculate that this will be easier to do – and potentially better – in Darklang, due to our holistic nature, compared to what is needed in the general case of supporting all programming languages. That said, GitHub has powerful parsing and static analysis tools at its disposal, so it might actually be able to have both.

Similarly, since the AI can't have the entire codebase in its head (at least not today), any AI-based codegen needs to figure out things like appropriate context. Steve Yegge came out of essay retirement to explain what a nightmare figuring out context is in the general case. Again, this might be easier in Darklang – we'll see.

Deployless

Of course, the main thing people love about Darklang is the Deployless feature. You write code, it goes straight into production (safely of course). Combine Deployless with AI and now we generate code straight into production (safely of course - though what that means has yet to be discovered). That skips significant steps and gets us back to the coding part much faster.

Now, today's AI can already help us deploy code and services, because it can already generate Terraform or SQL migrations or Kubernetes manifests. Those, however, are areas where it's hard to feed appropriate context into the AI (I'm imagining people dumping snapshots of Postgres locks into their GPT prompts), and also ones where you would necessarily be a little sketched out about running the commands that the AI generates.

Deployless also provides the potential to feed back lots of information into the AI. We've seen that the AI enjoys feedback and can make improvements based on errors it's given. So being able to immediately say "we ran it and got this output" is powerful, especially since we could do that automatically using Darklang's Trace-Driven Development. That would give us an end-to-end AI experience that perhaps won't exist elsewhere.
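
To make that feedback loop a bit more concrete, here's a minimal sketch in Python (not Darklang, and not how Darklang actually does it). Everything in it is an assumption for illustration: fake_model stands in for a real model call, and run_candidate stands in for "we ran it and got this output". The only point is the shape of the loop: generate, run, hand the error back, try again.

```python
import traceback
from typing import Callable, Optional, Tuple


def fake_model(task: str, last_error: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM call; returns Python source as text."""
    if last_error is None:
        # First attempt deliberately contains a bug (division by zero).
        return "def average(xs):\n    return sum(xs) / 0\n"
    # Later attempts pretend the model has read the error and fixed the code.
    return "def average(xs):\n    return sum(xs) / len(xs)\n"


def run_candidate(source: str) -> Optional[str]:
    """Run the generated code on a sample input.

    Returns None on success, or an error report to feed back to the model.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # illustration only; never exec untrusted code
        result = namespace["average"]([2, 4, 6])
        if result != 4:
            return f"expected 4, got {result!r}"
        return None
    except Exception:
        return traceback.format_exc(limit=1)


def generate_with_feedback(
    task: str,
    model: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Ask the model for code, run it, and pass any error into the next attempt."""
    last_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        source = model(task, last_error)
        last_error = run_candidate(source)
        if last_error is None:
            return source, attempt
    raise RuntimeError(f"no working candidate after {max_attempts} attempts")


source, attempts = generate_with_feedback("average a list of numbers", fake_model)
```

In this toy version the "trace" is a single hard-coded input; the interesting part in practice would be where the real inputs and outputs come from.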

And of course, the advantages of Deployless remain: today when the AI completes the task in 5 minutes, you'll still have to spend three hours shepherding the code out into production. You could have written 20 AI-generated features in that time. Removing the deployment step entirely from code remains a massive deal.

Language features

As of today, no programming language is particularly tuned to enable AI-generated code, most languages having been written for a text-editing world. What potential is there here?

Some obvious existing features that will help the AI are static typing and immutability. Static typing can provide excellent feedback that generated code will work, and the better the static typing, the better the feedback. I expect, for example, that you'd get much better generated code from Typescript than from Python, as Typescript's type checking is a bit stronger than MyPy's.

Very strongly statically-typed languages like Haskell, F#, Rust, and of course Darklang – languages where people routinely observe "if it compiles, it works" – will presumably do really well when errors are fed back to the AI. I suspect this may even lead to a resurgence of weird esoteric type systems (in a manner similar to the rise of Rust), as now it's just AIs that have to deal with their bullshit, not humans.

Immutability is another likely win: AIs do much better on code directly in front of them than on far-away state being modified indirectly (such as via OO). Honestly, so do humans. Immutable languages with powerful primitive types (lists, maps and records), might allow the AI to generate significantly better code as the computation will have much higher locality. This is very much speculation, obviously; I really want this to be true, due to my heavy bias towards functional programming. Again, we'll see what happens.
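
As a rough illustration of what "locality" means here, consider this small Python sketch (Python rather than Darklang, and the Cart type and helper names are made up for the example). Each function takes a value and returns a new one, so everything needed to understand a step is in its arguments and its return value, with no hidden state changing elsewhere.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Cart:
    """Hypothetical immutable record; frozen=True forbids in-place mutation."""
    items: Tuple[str, ...]
    discount_percent: int


def add_item(cart: Cart, item: str) -> Cart:
    # Returns a new Cart; the caller's original value is left untouched.
    return replace(cart, items=cart.items + (item,))


def apply_discount(cart: Cart, percent: int) -> Cart:
    # Again a new value, so each step can be read and checked in isolation.
    return replace(cart, discount_percent=percent)


empty = Cart(items=(), discount_percent=0)
with_book = add_item(empty, "book")
discounted = apply_discount(with_book, 10)
```

Whether this style really produces better generated code is exactly the speculation above; the sketch only shows the property being speculated about.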

Complexity

Overall, lower complexity should get better outcomes with AI. If we manage to remove steps (such as deployment or using Darklang's "Invisible Infrastructure"), that's better than having to clog the AI's head with boilerplate. It's also better for users: while the AI will happily paper over complexity by generating boilerplate, the human still has to read and verify it. It would simply be better for everyone if the boilerplate didn't have to exist.

Inspiration for our work

LangChain (& ChatGPT Plugins)

LangChain is one of the most interesting ways that people have been building on AI – hooking the AI up to other tools to validate generated code, provide more context, access other services or data, etc, enabling it to iterate itself automatically to a better solution.

Darklang has always been about connecting different services together, and so building this sort of tooling is an obvious starting point as we build Darklang's AI integration. "DarkLangChain" anyone? This was also one of the use cases suggested by the more excitable users that we spoke to.

In particular, allowing users to extremely easily package up parts of their chains to be reused by others using Darklang's built-in package manager is going to be really cool.

AI remixing

One of the biggest inspirations for what we can accomplish with AI is Ken Van Haren's post about building an analyst bot. Ken provided his table schema to the AI, then asked it to provide SQL statements to answer questions about the data (with all sorts of LangChain steps along the way to reformat data, visualize results, etc).
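
The basic shape of that idea can be sketched in a few lines of Python. This is not Ken's code: the orders table, the prompt wording, and fake_model (a stand-in that returns a canned query instead of calling a real model) are all assumptions made up for illustration, and the SELECT-only check is a crude placeholder rather than real safety.

```python
import sqlite3
from typing import Callable, List, Tuple

# Hypothetical schema, invented for this example.
SCHEMA = (
    "CREATE TABLE orders "
    "(id INTEGER PRIMARY KEY, customer TEXT, total_cents INTEGER)"
)


def build_prompt(schema: str, question: str) -> str:
    """Give the model the table schema and ask for one SQL query."""
    return (
        "You are given this SQLite schema:\n"
        f"{schema}\n"
        "Write one SELECT statement that answers the question.\n"
        f"Question: {question}\n"
    )


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a canned query."""
    return (
        "SELECT customer, SUM(total_cents) AS spent FROM orders "
        "GROUP BY customer ORDER BY spent DESC LIMIT 1"
    )


def answer(
    conn: sqlite3.Connection,
    question: str,
    model: Callable[[str], str] = fake_model,
) -> List[Tuple]:
    sql = model(build_prompt(SCHEMA, question)).strip()
    # Crude guard, for illustration only: accept a single SELECT statement.
    if not sql.lower().startswith("select") or ";" in sql:
        raise ValueError("refusing to run non-SELECT or multi-statement SQL")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO orders (customer, total_cents) VALUES (?, ?)",
    [("ada", 1200), ("bob", 500), ("ada", 300)],
)
rows = answer(conn, "Which customer spent the most?")
```

The user-facing surface here is a question in plain language; the schema does the work an API definition would otherwise do.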

This sort of "AI remixing" – exposing your core functionality and data to the AI to be remixed by users – is a fundamentally better and easier-to-use approach than the old way to do this: creating and using APIs.

I suspect this is going to have a huge impact on how software is written going forward, especially for companies building internal tools. If you can build a dashboard for your boss in two weeks, or strap an AI to a database in an hour instead, it's a fairly obvious choice. There's also a massive difference between a service whose API allows six specific queries, and one that can do arbitrary queries specified by the user in natural language.

If it can be made safe, of course, which is a big "if". Still, ChatGPT Plugins, a sort of cousin to this approach, lends credibility to it.

UX for generated code

Cursor is an IDE for generating code with AI – even their very early demos are compelling at showing how the UX of writing software can change. It's much less concerned with highlighting matching parentheses and far more focused on providing tools to allow users to validate whether they approve of the code coming out of the AI.

This feels a lot more "AI-native" than Copilot's initial AI integration, which would probably be better described as "text-editor native". The UX also seems much better than the Copilot X demos, in my opinion. I would guess we are going to see a lot of UX experimentation outside the space of text buffers and source files, as text editing becomes much less central to the act of writing code.

Of course, it's an open question whether developers actually want this.

What's the plan?

We did some early validation of the direction in January, and since Feb 1st, we've been rushing as quickly as we can to move ourselves over to AI-based codegen.

Continuity for existing users

Supporting existing users is extremely important to us. Existing users want a stable environment, not one where we're doing unhinged experiments with AI. As such, we've decided to move "darklang-classic" (the backend behind darklang.com and builtwithdark.com) to a new repo, and to only do security updates on it. Once we've figured out how our new AI-based platform works, we'll look at porting users over.

Killing the editor

As you might know, in Darklang-classic, you wrote code using a "structured editor". This is a non-freeform editing experience that our users have rated somewhere between "Ok I guess" and "probably the worst part of Darklang".

As well as no longer being important in a world of generated code, the old editor's code was pretty awful, and no one was really excited about saving it. While we'll always remember the good times we had with the structured editor, long story short, a few of us took it round back and shot it in the head last month.

Moving the language forward

By removing the editor, and the backward-compatibility requirements, we've unlocked incredible progress on moving the Darklang language forward. We've made huge strides in the type system, removing tech debt, adding language features, fixing inconsistencies and runtime errors, improving error messages, removing newly-unnecessary language features like Blanks (aka typed holes), and supporting a much larger set of Darklang values in our databases and queues.

Overall, we accomplished more in February and March than in the past two years combined. We're tracking this in a lightweight way in our Notion; feel free to poke around.

Building AI-first

It didn't take much work to get ChatGPT to write Darklang code, but we also need a lot of experience using that output. Our focus in the future will be writing almost all of the Darklang product in Darklang using AI. Our goal is that >90% of the code we write, including the AI experiments, OpenAI integration, ApiServer, user management, notifications, package manager, DarkLangChain, etc, is built directly in Darklang by the AI.

This will help us get a good understanding of what the UX needs to be and iterate our way to it. It will also help us understand if the language needs to be modified to better support AI.

Experiments

Before all that happens though, we are going to be running experiments. As I said, we have no idea what this product needs to be. We're back in the pre-product stage of development, with a small, well-integrated team, a tiny little tech moat (Deployless), and a community of people rooting for us (which we very much appreciate!)

Our plan is to run a bunch of experiments, let you try them out, see what people like and what sticks, and use it to figure out what the product needs to be.

How to follow along?

  • If you're excited about what we're working on, consider sponsoring us.
  • For the firehose of information, watch or star our repo and join our Discord, check out our tracking docs on Notion, or follow me on Twitch.
  • If you're looking for updates, including announcements about our experiments and progress, join the mailing list and follow our socials.
  • To sign up for the DarklangGPT beta, join the mailing list (and maybe sponsor us to get to the front of the line).
  • To see what's special about Deployless, Invisible Infrastructure, and Trace-Driven Development, try Darklang classic.