Going 15 percent faster with graph-based type-checking (part one)

19 December 2024

Open-source software

By Florian, Open-Source Software Engineer

Our researchers and engineers use a lot of C# and F# in their models and over time, they became increasingly frustrated with the time it took to update and execute code.

Given the potential impact even the smallest improvement here might yield, this was something that we needed to improve. However, given the sheer size and performance requirements of our platforms and tooling, these fixes to expedite coding and execution were not necessarily something that the wider open source software community would help with.

Enter Project Velocity.

Velocity was a three-month internal effort to investigate what could be done to improve the experience of developing and pushing code into production, and I was excited to become a part of this work.

I joined the open source team full time in April 2022 and my first assignment was to improve the .NET developer experience. G-Research is a special environment, with tight security constraints. So, to experience what the developers were experiencing, I had to go to the office in London.

In May 2022, during my first visit, I was introduced to Janusz Wrobel and Chris Arnott. They had been working on Velocity, an internal project to investigate what could be done to improve the developers’ experience.

Our engineers deal with a large solution, consisting of hundreds of instances, with a mix of C# and F#. This made everything slow: from restoring NuGet packages, to building the code, as well as the editor environment.

My colleague Marcin Krystianc was already investigating the NuGet side of things, and had worked on something to make the static graph evaluation better for NuGet, as well as detecting any regressions. This was important work and provided a useful start point from which we could further accelerate the Velocity project.

During the investigation, we discovered that F# code takes a long time to build and does not seem to use a lot of parallelization. When compiling an F# project, it was also apparent that the type-check phase was the most expensive one. This was something I could focus on. Here was a place where I could bring my expertise to bear.

Though I previously wrote about my work creating this feature in the Microsoft Developer Blog, I’ll use this blog series to tell the behind-the-scenes story of how and why the graph-based type-checking feature in the F# compiler came to be, and to show how it is being used at scale to ensure faster processing for G-Research.

It’s the story of how I advocated for important changes to the open source code of F# that would allow us to push the software beyond what previously seemed like its maximum capacity. It’s the story of how open source innovation doesn’t just happen – it sometimes needs to be midwifed-along, pushed through, cajoled into existence, prodded beyond what seems possible – to get real progress.

Person with blue hair typing code on two monitors at a desk in a bright modern open office

Interested in open-source software at G-Research?

Explore our tools and learn more about how we contribute to the open-source ecosystem.

Learn more

Parallelizing previously sequential type-checking

When investigating the F# compiler, it became clear that type-checking happened entirely sequentially. This is due to the very nature of how F# code works. It is the same reason why you can, for example, only have access to a function after it has been defined.

We were not the first ones to think about parallelizing the type-check phase in the compiler. Will Smith had attempted it earlier. His idea was to skip the implementation files when signatures were present. Type-checking would happen in two phases: first go over the entire project, skipping any implementation files in a pair. And in a second phase, process the skipped files in parallel.

As there was some prior art, I decided to take a stab at reviving this and quickly found it was hard stuff to get into. Because I couldn’t fully grasp everything in the type-check phase, I reached out to talk to Don Syme, the creator of F#, to try and have a pairing session. We connected, and it led to the PR being finished and merged just in time for the `7.0.100` release in November 2022.

We already overshot the safe-release window, but Don pushed for it, and it got in anyway.

Having this was a good first step, but it would only really speed up things if you were using signature files. This landing in the Software Development Kit (SDK) wasn’t going to speed up everything for G-Research overnight. It was not going to be a quick, or easy, win.

Having it depend on dealing with signature files also was a hard sell, too. G-Research eventually removed many of the signature files it was using because the engineers didn’t find them to be worth the effort.

Regardless of the outcome, this was a good first step toward embracing parallelism inside the compiler. When doing this kind of open source endeavour, you need to play the long game, be patient, and work through the system.

Determining the graph

Because this first feature landed in the .NET SDK, the question was raised: can we not do this for more files? When reasoning about this, we figured that some files in a project don’t really need to know about each other in order to type-check them.

The unit test scenario is the best way to illustrate this. Two unit test files typically don’t depend on each other and only exist as two files because of a logical split of the production code that is being tested. We came up with the idea to construct a graph of all the files in a project. This graph potentially has multiple branches that can be processed in parallel and thus make the whole type-check phase of the F# compilation faster.

In order to create a graph of the project we need to inspect the files to determine the relationship between them. This can be done after the type-checking phase but that, of course, defeated the purpose. We need to detect this before the start of the type-checking phase. Using the untyped syntax tree came to mind – it would mean processing each file, getting ahold of all the identifiers it exposes and finding out which other files use those identifiers.

Initially, I was quite sceptical that we could pull this off using the untyped tree. My initial fear was that we might miss a link between two files and that this would cause the project not to type-check.

I had some experience with untyped syntax trees because those are also used in the Fantomas project. And being the maintainer of that project, I’m well aware how difficult it is to always process all the F# code correctly. There are always “unknown unknowns,” and it is very hard to say that you covered an entire language. This led me to revive the idea of being explicit about it. I again talked to Don about it and we shared some thoughts.

Having that language change would allow us to define all the file dependencies. But at the cost that every F# file would need to be edited, and well, it would be a large change in the compiler. This wasn’t realistic and I abandoned that idea rather quickly.

Janusz was still determined that the graph could be constructed from the current untyped syntax tree model. Our opposing views on this topic led to work on a fun weekend project where he made a proof of concept.

Graphs, type-checking and Saturday morning coffees

I was impressed by how accurate Janusz’s proof of concept was. It convinced me to start believing in the idea. This, of course, was only the beginning of a long, lonesome road of crafting the end-to-end feature. There were many open questions remaining, yet it became clear this was our best bet. By October 2022, Janusz and I started collaborating closely to work on this idea.

Realistically speaking, we knew that we probably were not going to find all the “unknown unknowns” before releasing this feature into the wild. That would take too long and be too hard. I convinced Janusz that if I rewrote some of his proof-of-concept code, it might be easier to understand what was happening and that users would be able to more easily debug their project if something isn’t working. It was a way for us to release something and rely on early users to find and work through any specific issues they might encounter.

When we were working on this, we discovered one roadblock after the other. Janusz and I had quite a few Saturday morning sessions as well to discuss our progress. The two of us were exploring multiple ideas and/or tackling different problem spaces. So, on Saturday morning we were aligning our puzzle pieces over some virtual coffee. It became the thing that we shared to make what felt like a lonesome software struggle easier. Finding something personal, an opportunity to connect, even when the British channel separated us.

Our PR was shaping up and we made good progress. However, we were in a difficult position: – our PR was getting quite large.

Going 15 percent faster with graph-based type-checking (part two)

Part two details the journey of optimising .NET performance with graph-based type-checking, overcoming open-source challenges, achieving 15% faster builds and reflecting on balancing incremental and transformational change.