Published on February 11, 2025

See you later, Allocator

I warned you about the puns. Those, and anime references, are not going to stop. You've been warned. Again.

Anyhow, time for an allocator, and a utility library to go along with it.

Why?

Why create a custom allocator at all? In a word: Performance.

The memory that a C++ program can access is divided into two parts:

The stack, whose size the compiler can determine at compile time. This means that the operating system can carve out a segment of that size for my program when it starts, and that's that.

Most "regular variables" live on the stack, i.e.
```
int i_live_on_the_stack = 1;
char me_too = 'W';
int and_all_of_us_as_well[4] = {1, 2, 3, 4};
```
But there's two drawbacks:
1. The stack's size is limited. Usually it's pretty small!
  
  Especially if my program has recursive function calls, or just long chains of functions calling functions calling other functions, and each of those functions uses a lot of stack memory, my program can run out of stack memory altogether (which is bad, and crashes your program).
2. I don't know the size of some things at compile time.
  
  For example, if a game has a hornet's nest that spawns a new hornet every second, then how should I know how many there are going to be? Maybe the player is really bad at killing hornets, but has a ton of health potions, so they end up with a thousand hornets chasing them! So, where do we store our hornets?
The heap, which we can grow at runtime!

This is also called "dynamic memory allocation", and is much more flexible.

C++ types like std::vector or std::string use this to enable themselves to grow (and shrink) as needed.

But it comes at a cost: To get memory from the heap, we must make a System Call. That means handing over control to the operating system. And that means being completely at the mercy of the operating system to get that control back! If the OS is busy, or decides to be mean to me personally¹, then I can kiss a consistent frame rate goodbye :(

Another problem is locality. There's no guarantee that the chunks of memory that the operating system hands out are anywhere near each other. But computers go (much!) faster when they work on memory locations that are near each other, thanks to CPU caching.

So, if I want to go fast, I should reduce the number of heap allocations as much as possible. With a custom allocator, I can do that! My custom allocator can allocate a big chunk of memory when the program starts, and hand out pieces of that to things that want dynamic memory as needed. I could even use different big chunks for things that I know I'll use together, so that I can take advantage of locality.

But implementing such an Arena (or Region) Allocator can wait until I actually need that performance. For now, I just want to make sure I can slip it in when I do!

How?

This isn't going to be very interesting, so I'll start by talking a bit about what I'm not going to do: I'm not going to wholesale replace malloc, free, new and delete². Not only because it's confusing and unexpected for casual readers (it's very not obvious when code is using a custom version of malloc and friends), but also because I want to be able to decide on a whim which allocator to use.

Instead, I'm going to take a page out of Zig's playbook and pass an Allocator to any function that needs one. This has a few nice benefits:

It's obvious whether or not a function dynamically allocates memory (at least once you know the convention of only using a custom allocator).
I can use a special, instrumented, allocator in tests to check on the amount of memory that functions use.
Everything I'm doing works nicely with C++'s paradigms, rather than having to stoop to writing C-style code.

Allocator Interface

So, what does the interface look like? As simple as I can get away with! Let's go with this for a start:

class Allocator {
 public:
  virtual ~Allocator() = default;
  // Returns a pointers to `size` bytes of memory.
  virtual void* Alloc(size_t size) = 0;
  // Frees the memory that `ptr` points to.
  //
  // Returns false when it can't free the memory (like when `ptr` is
  // not a pointer obtained via `Alloc()` from this same allocator.
  virtual bool Free(void* ptr) = 0;
};

Simple enough.

TestAllocator

I mentioned dependency injection in the intro. Here's a great example: If I build a lightly instrumented Allocator for testing, then I can validate how much memory all of my functions use. Here we go:

class TestAllocator : public Allocator {
public:
  // Keeps track of the allocated memory segments.
  // Use this for test assertions.
  std::unordered_map<void*, size_t> allocated_pointers;

  // Uses base `malloc()` to get `size` bytes of memory,
  // Saves the pointer and size to `allocated_pointers`
  // and returns the pointer.
  void* Alloc(size_t size) override {
    void* ptr = malloc(size);
    allocated_pointers[ptr] = size;
    return ptr;
  }

  // `free()`s `ptr`, if and only if `ptr` was obtained
  // via `Alloc()` above.
  bool Free(void* ptr) override {
    auto it = allocated_pointers.find(ptr);
    if (it == allocated_pointers.end()) {
      return false;
    }
    free(ptr);
    allocated_pointers.erase(ptr);
    return true;
  }
};

Again, nothing wild here.

c++20 Allocator Adaptor

Now here comes the fun part. There are some very handy parts of the C++ standard library that rely on dynamically allocating memory. It would be nice if I could use those with my custom allocator, right?

Turns out we can! All I need to do is implement the functions and define the types that std::allocator_traits expects. Now, I'm not really qualified to tell you about how traits work, but they're... templates that are sort of a way to get some of the same benefits of polymorphism without inheritance (and the associated runtime overheads), or of concepts without... well, concepts. std::allocator_traits in particular works with any class that has a few specific types and methods, and allows things like STL containers to use those types and methods without knowing anything about the actual allocator class template.

Luckily, the set of required things for std::allocator_traits is quite small. Given an allocator template class MyAlloc<T>, they are:

MyAlloc<T>::value_type (must be T)
T* MyAlloc<T>::allocate(size_t n)
MyAlloc<T>::deallocate(T* size_t n) noexcept
Equality operators
Copy and move constructors/assignment operators, supporting MyAlloc<V> (i.e. allocator instances with a different template type)

Since MyAlloc<T> must be a template (so that one can instantiate it for different types T), I can't build that functionality into TestAllocator (or the Allocator interface class) itself.

But I can make an adaptor template! Which is probably for the best, since the fact that this adaptor holds a pointer to the Allocator itself makes copy and move much simpler.

template<typename T>
class AllocatorAdaptor {

private:
  Allocator* alloc_ = nullptr;

public:
  using value_type = T;

  T* allocate(size_t n) {
    return reinterpret_cast<T*>(alloc_->Alloc(n));
  }
  void deallocate(T* ptr, size_t n) noexcept {
    alloc_->Free(ptr);;
  }

  template<typename V>
  bool operator==(const AllocatorAdaptor<V>& other) noexcept {
    return *(this->alloc_) == *(other.alloc_);
  }

  template<typename V>
  bool operator!=(const AllocatorAdaptor<V>& other) noexcept {
    return !(*this == other);
  }

  AllocatorAdaptor(Allocator& a) noexcept : alloc_(&a) {}
  
  template<typename V>
  AllocatorAdaptor(AllocatorAdaptor<V>& other) noexcept : alloc_(other.alloc_) {}

  template<typename V>
  AllocatorAdaptor(AllocatorAdaptor<V>&& other) noexcept {
    std::swap(alloc_, other.alloc_);
  }

  template<typename V>
  AllocatorAdaptor& operator=(AllocatorAdaptor<V>& other) noexcept {
    alloc_ = other.alloc_;
    return *this;
  }

  template<typename V>
  AllocatorAdaptor& operator=(AllocatorAdaptor<V>&& other) noexcept {
    std::swap(alloc_, other.alloc_);
    return *this;
  }
};

Easy peasy, if a bit wordy.

Note

Since AllocatorAdaptor holds a pointer to an Allocator, and can be copied, I should probably make Alloc() and Free() thread safe when I implement the Allocator interface for real. But surely I can get away with just being very careful, and/or using only a single thread¹.

No, this isn't foreshadowing². ↩

Yet. ↩

MakeUnique() helper

One small annoyance: std::make_unique() doesn't work with custom allocators. There's a variation on std::make_shared() (std::allocate_shared()) that does, but I don't want shared_ptr almost ever.

No big deal, we'll just make our own (probably slightly less robust):

struct AllocatorDeleter {
  Allocator* alloc = nullptr;
  void operator()(void* ptr){
    if (alloc == nullptr) {
      return;
    }
    assert(alloc->Free(ptr));
  }
};

template <class T>
std::unique_ptr<T, AllocatorDeleter> MakeUnique(Allocator& a) {
  return std::unique_ptr<T, AllocatorDeleter>{
    reinterpret_cast<T*>(a.Alloc(sizeof(T))),
    AllocatorDeleter{&a},
  };
}

assert()ing that Free() succeeds may seem a bit drastic, but it only does anything in debug builds, and I really want to catch situations where Free() fails.

Wrapping up

And that's it! Memory management solved³.