Optimizing by 170,000%(!) by not being silly

I care about the performance of RavenDB. Enough that I would go to epic lengths to fix them. Here I use “epic” both in terms of the Agile meaning of multi-month journeys and the actual amount of work required. See my recent posts about RavenDB 7.1 I/O work. There hasn’t been a single release in the past 15 years that didn’t improve the performance of RavenDB in some way. We have an entire team whose sole task is to find bottlenecks and fix them, to the point where assembly language is a high-level concept at times (yes, we design some pieces of RavenDB with CPU microcode for performance).When we ran into this issue, I was… quite surprised, to say the least. The problem was that whenever we serialized a document in RavenDB, we would compile some LINQ expressions. That is expensive, and utterly wasteful. That is the sort of thing that we should never do, especially since there was no actual need for it.Here is the essence of this fix:We ran a performance test on the before & after versions, just to know what kind of performance we left on the table.Before (ms)After (ms)33,78220The fixed version is 1,689 times faster, if you can believe that. So here is a fix that is both great to have and quite annoying. We focused so much effort on optimizing the server, and yet we missed something that obvious? How can that be?Well, the answer is that this isn’t an actual benchmark. The problem is that this code is invoked per instance created instead of globally, and it is created once per thread. In any situation where the number of threads is more or less fixed (most production scenarios, where you’ll be using a thread pool, as well as in most benchmarks), you are never going to see this problem.It is when you have threads dying and being created (such as when you handle spikes) that you’ll run into this issue. Make no mistake, it is an actual issue. When your load spikes, the thread pool will issue new threads, and they will consume a lot of CPU initially for absolutely no reason. In short, we managed to miss this entirely (the code dates to 2017!) for a long time. It never appeared in any benchmark. The fix itself is trivial, of course, and we are unlikely to be able to show any real benefits from it in a benchmark, but that is yet another step in making RavenDB better.

Mar 20, 2025 - 13:52

I care about the performance of RavenDB. Enough that I would go to epic lengths to fix them. Here I use “epic” both in terms of the Agile meaning of multi-month journeys and the actual amount of work required. See my recent posts about RavenDB 7.1 I/O work.

There hasn’t been a single release in the past 15 years that didn’t improve the performance of RavenDB in some way. We have an entire team whose sole task is to find bottlenecks and fix them, to the point where assembly language is a high-level concept at times (yes, we design some pieces of RavenDB with CPU microcode for performance).

When we ran into this issue, I was… quite surprised, to say the least. The problem was that whenever we serialized a document in RavenDB, we would compile some LINQ expressions.

That is expensive, and utterly wasteful. That is the sort of thing that we should never do, especially since there was no actual need for it.

Here is the essence of this fix:

We ran a performance test on the before & after versions, just to know what kind of performance we left on the table.

Before (ms)	After (ms)
33,782	20

The fixed version is 1,689 times faster, if you can believe that.

So here is a fix that is both great to have and quite annoying. We focused so much effort on optimizing the server, and yet we missed something that obvious? How can that be?

Well, the answer is that this isn’t an actual benchmark. The problem is that this code is invoked per instance created instead of globally, and it is created once per thread. In any situation where the number of threads is more or less fixed (most production scenarios, where you’ll be using a thread pool, as well as in most benchmarks), you are never going to see this problem.

It is when you have threads dying and being created (such as when you handle spikes) that you’ll run into this issue. Make no mistake, it is an actual issue. When your load spikes, the thread pool will issue new threads, and they will consume a lot of CPU initially for absolutely no reason.

In short, we managed to miss this entirely (the code dates to 2017!) for a long time. It never appeared in any benchmark. The fix itself is trivial, of course, and we are unlikely to be able to show any real benefits from it in a benchmark, but that is yet another step in making RavenDB better.