Asset loading (Part 1)

Note

Did you know? Ranting about this experience to my friends was what prompted me to set up this entire blog!

Alright, so I've got a build system, an allocator interface, and several folders full of 30-year-old source code to pore over. Sounds like a fun evening!

Like I said in the intro post, the first few chunks of code I'll port are going to be about loading assets.

Here's an overview of the different file formats that wipEout uses, sorted by their filename extension:

Extension  Description
.CMP       These are compressed textures, I think. They're loaded with a function called LoadCompressedTextureSequence().
.DPQ       Only used in a folder called NEWGRAPH, so maybe these are PC-specific [1]. Used with ReadPCX(), so it seems this and .PCX below are companion formats.
.INF       ??? – I guess we'll find out eventually.
.PCX       Also only used in the NEWGRAPH folder. Maybe textures? The relevant function seems to be LoadVRAM().
.PRM       Probably 3D models. The game loads them in LoadPrm().
.TEX       More textures? Loaded with LoadWtlFiles().
.TIM       Probably more Texture IMages. Loaded with LoadTexture() / Load16BitTexture().
.TTF       No, these aren't TrueType Fonts, but rather Texture Template Files. I don't know what texture templates do just yet, but I do know the game uses LoadTtfFile() to load these.

Slurp

But before I get to the actual loading / decoding of data, I'm going to need some very basic functionality:

Given a path on the file system, and a correctly-sized buffer,

  1. Open that file
  2. Copy its contents, bit-for-bit, into the buffer

Some programming languages have this built in and call it slurp. C++ doesn't, and I like the name, so I'll implement slurping myself.

Now, some people will tell you to do that like this:

std::vector<uint8_t> Slurp(std::filesystem::path path) {
  // Create a buffer the same size as our file.
  // Parentheses, not braces: braces would pick the initializer-list
  // constructor instead of the "n elements" one.
  std::vector<uint8_t> buffer(
      std::filesystem::file_size(path));
  std::ifstream input_stream(path, std::ios::binary);
  // Copy from the file to the buffer, in one-byte increments.
  std::copy(
      std::istream_iterator<uint8_t>(input_stream),
      std::istream_iterator<uint8_t>(),
      // No std::back_inserter because `buffer` already has
      // the right number of elements.
      buffer.begin());
  return buffer;
}

Looks a bit verbose, but otherwise okay, right?

Wrong.

Let's forget about performance for a second and assume that std::copy() is a very clever abstraction that gives us the highest speed we could possibly achieve.

Even then, there is a problem: If your file happens to contain a 0x20 byte, this function will eat that byte and pretend it never existed.

If you know your ASCII tables, this might raise an alarm for you. "Isn't 0x20 a space character?"

Indeed it is, and what's happening here is a feature: formatted input on C++ streams skips whitespace by default. More specifically, std::istream_iterator reads each element with operator>>, and operator>> constructs a std::basic_istream::sentry that discards leading whitespace as long as the skipws flag is set (which it is by default). Tabs, newlines, and the other whitespace bytes suffer the same fate.
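The effect is easy to reproduce without touching the file system: push the same bytes through a formatted read and an unformatted one, then count what comes out. This is only a sketch. The function names are mine, and std::istringstream stands in for the file stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted input: std::istream_iterator extracts each element with
// operator>>, which skips leading whitespace while `skipws` is set
// (it is set by default).
std::vector<std::uint8_t> ReadFormatted(const std::string& data) {
  std::istringstream in(data);
  std::istream_iterator<std::uint8_t> first(in);
  std::istream_iterator<std::uint8_t> last;
  return std::vector<std::uint8_t>(first, last);
}

// Unformatted input: read() copies the bytes exactly as they are.
std::vector<std::uint8_t> ReadUnformatted(const std::string& data) {
  std::istringstream in(data);
  std::vector<std::uint8_t> buffer(data.size());
  in.read(reinterpret_cast<char*>(buffer.data()),
          static_cast<std::streamsize>(buffer.size()));
  // Keep only what was actually read.
  buffer.resize(static_cast<std::size_t>(in.gcount()));
  return buffer;
}

// Five bytes, two of which (0x20 and 0x0A) count as whitespace.
inline std::string SampleBytes() {
  return std::string("\x01\x20\x02\x0a\x03", 5);
}

void CheckWhitespaceHandling() {
  assert(ReadFormatted(SampleBytes()).size() == 3);    // whitespace dropped
  assert(ReadUnformatted(SampleBytes()).size() == 5);  // every byte kept
}
```

With a binary asset file, those missing bytes shift everything after them, which is exactly the kind of corruption that looks like a bug in the decoder rather than in the reader.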

Why am I telling you this, if I know better? Because I only know better after spending an afternoon debugging my .CMP loading code (below), only to eventually find that the Slurp() function was at fault.

Here's a better way to Slurp():

std::vector<uint8_t> Slurp(std::filesystem::path path) {
  // Create a buffer the same size as our file.
  std::vector<uint8_t> buffer(
      std::filesystem::file_size(path));
  std::ifstream input_stream(path, std::ios::binary);
  input_stream.read(
      reinterpret_cast<char*>(buffer.data()),
      buffer.size());
  return buffer;
}

This works because std::basic_istream::read() is an UnformattedInputFunction, which, among other things,

Constructs an object of type basic_istream::sentry with automatic storage duration and with the noskipws argument set to true [...]

(emphasis mine)
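One thing this Slurp() does not do is check whether the file opened at all, or whether read() delivered every byte. Below is a sketch of a more defensive variant. The name SlurpChecked, the bool-plus-output-parameter signature, and the example file name are my own choices for illustration, and it sizes the buffer by seeking to the end of the stream instead of calling std::filesystem::file_size().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical defensive variant of Slurp(). Returns false if the file
// cannot be opened or cannot be read completely; `out` is only modified
// on success.
bool SlurpChecked(const std::string& path, std::vector<std::uint8_t>* out) {
  // Open positioned at the end so tellg() reports the file size.
  std::ifstream input_stream(path, std::ios::binary | std::ios::ate);
  if (!input_stream) {
    return false;
  }
  const std::streamoff size = input_stream.tellg();
  if (size < 0) {
    return false;
  }
  input_stream.seekg(0, std::ios::beg);

  std::vector<std::uint8_t> buffer(static_cast<std::size_t>(size));
  input_stream.read(reinterpret_cast<char*>(buffer.data()),
                    static_cast<std::streamsize>(buffer.size()));
  // gcount() is the number of bytes the last unformatted read delivered.
  if (input_stream.gcount() != static_cast<std::streamsize>(buffer.size())) {
    return false;
  }
  out->swap(buffer);
  return true;
}

// Usage: write five bytes (two of them whitespace) and read them back.
void RoundTripExample() {
  const std::string path = "slurp_checked_example.bin";
  {
    std::ofstream out(path, std::ios::binary);
    out.write("\x01\x20\x02\x0a\x03", 5);
  }
  std::vector<std::uint8_t> bytes;
  const bool ok = SlurpChecked(path, &bytes);
  std::remove(path.c_str());
  assert(ok);
  assert(bytes.size() == 5);
  assert(bytes[1] == 0x20);
}
```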

Now, I don't really want to use std::vector here, mostly because the parsing code requires me to reinterpret_cast back and forth quite a lot, and going from std::vector<char> to std::vector<uint64_t> without copying the underlying data is, if not impossible, then at least annoying.

RaiiSpan

Enter another helper type: RaiiSpan

This uses a few things we've seen before, and one we haven't:

// RaiiSpan provides access to a raw "array" of type T that is
// owned by an `Allocator`.
//
// It automatically calls `Free()` on the `Allocator` upon destruction,
// avoiding memory leaks.
template <class T>
class RaiiSpan {
  // All instantiations of this template are friends!
  template <class OtherT>
  friend class RaiiSpan;
public:
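  // Allocates room for `size` elements of T. This assumes `Alloc()`
  // takes a byte count.
  RaiiSpan(size_t size, Allocator* alloc)
    : ptr_(alloc->Alloc(size * sizeof(T)), AllocatorDeleter{alloc}),
      span_(reinterpret_cast<T*>(ptr_.get()), size) {}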

  std::span<T>& Span() {
    return span_;
  }

  // Reinterpret the contents of the memory region owned by `ptr_` as
  // `NewT`, and return a corresponding RaiiSpan.
  //
  // This is only allowed if the size of `NewT` cleanly divides the
  // size of the memory region.
  //
  // Releases ownership of `ptr_` to the new span that it returns, so
  // this object must not be used afterwards.
  template<class NewT>
  RaiiSpan<NewT> Cast() && {
    const size_t total_bytes = span_.size() * sizeof(T);
    const size_t new_span_size = total_bytes / sizeof(NewT);
    assert(((void)("Size of source span is not a multiple of target type size"),
            (total_bytes % sizeof(NewT)) == 0));
    span_ = {};
    return RaiiSpan<NewT>(ptr_.release(),
                          new_span_size,
                          ptr_.get_deleter().alloc);
  }

private:
  std::unique_ptr<void, AllocatorDeleter> ptr_;
  std::span<T> span_;

  // Adopts an existing allocation. Used by Cast() to hand ownership
  // over to the new RaiiSpan.
  RaiiSpan(void* ts, size_t size, Allocator* alloc)
    : ptr_(ts, AllocatorDeleter{alloc}),
      span_(reinterpret_cast<T*>(ptr_.get()), size) {}
};

With this, I can read data into the char buffer that istream::read() wants first, then massage it until I have usable asset data.
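The ownership half of RaiiSpan is just a std::unique_ptr<void, ...> with a deleter that hands the memory back to the allocator. Here is a stripped-down sketch of that mechanism on its own. CountingAllocator, CountingDeleter, and the `live` counter are hypothetical stand-ins (the real Allocator interface comes from an earlier post and may look different), and ElementCountAfterCast() only mirrors the size arithmetic of Cast().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for the Allocator interface; `live` counts
// allocations that have not been freed yet.
struct CountingAllocator {
  int live = 0;
  void* Alloc(std::size_t bytes) {
    ++live;
    return std::malloc(bytes);
  }
  void Free(void* ptr) {
    --live;
    std::free(ptr);
  }
};

// Deleter that returns memory to the allocator it came from.
struct CountingDeleter {
  CountingAllocator* alloc;
  void operator()(void* ptr) const { alloc->Free(ptr); }
};

using OwnedBytes = std::unique_ptr<void, CountingDeleter>;

// Same arithmetic as Cast(): how many NewT fit into `count` elements of T?
template <class T, class NewT>
std::size_t ElementCountAfterCast(std::size_t count) {
  const std::size_t total_bytes = count * sizeof(T);
  assert(total_bytes % sizeof(NewT) == 0);
  return total_bytes / sizeof(NewT);
}

// Allocates, transfers ownership once (as Cast() does via release()),
// and returns how many allocations are still live afterwards.
int LiveAllocationsAfterScope(CountingAllocator& allocator) {
  {
    OwnedBytes first(allocator.Alloc(8), CountingDeleter{&allocator});
    OwnedBytes second(first.release(), first.get_deleter());
    assert(first == nullptr);
    assert(allocator.live == 1);
  }  // `second` is destroyed here and calls Free() exactly once.
  return allocator.live;
}
```

The point of release() in Cast() is visible here: after the hand-over, only one owner is left, so the memory is freed once and not twice.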

.TTF loading

I couldn't tell you why I decided to start with the .TTF files, but it turned out to have been a good idea. Each .TTF file is just a contiguous array of a simple struct. Only one small problem: It's stored in big-endian byte order, but most modern computers are little-endian [2].

Luckily, the struct in question is neatly packed:

struct TexTemplate {
  uint16_t tex16[16];
  uint16_t tex4[4];
  uint16_t tex1[1];
};
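"Neatly packed" is something the compiler can confirm. A small sketch, assuming the on-disk record really is 21 consecutive 16-bit values with no padding, which is what the struct suggests. RecordCount() is a hypothetical helper, not something from the original source.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct TexTemplate {
  std::uint16_t tex16[16];
  std::uint16_t tex4[4];
  std::uint16_t tex1[1];
};

// 16 + 4 + 1 = 21 values of 2 bytes each.
static_assert(sizeof(TexTemplate) == 21 * sizeof(std::uint16_t),
              "TexTemplate must not contain padding");
static_assert(std::is_trivially_copyable<TexTemplate>::value,
              "TexTemplate must be safe to fill from raw bytes");

// Number of records in a file of `file_size` bytes; 0 if the size is not
// a whole multiple of the record size.
constexpr std::size_t RecordCount(std::size_t file_size) {
  return file_size % sizeof(TexTemplate) == 0
             ? file_size / sizeof(TexTemplate)
             : 0;
}
```

If either static_assert ever fired, reinterpreting the raw file contents as an array of TexTemplate would silently read the wrong fields.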

So I can use the helpers I've already got to knock out TTF loading really quickly:

auto LoadTtf(
    const std::filesystem::path& path,
    Allocator& allocator)
    -> util::RaiiSpan<TexTemplate> {
  auto int16_span = util::Slurp(
      path,
      allocator).Cast<uint16_t>();
  // Swap endianness
  for (uint16_t& x : int16_span.Span()) {
    x = ((x & 0x00ff) << 8) + ((x & 0xff00) >> 8);
  }
  return std::move(int16_span).Cast<TexTemplate>();
}
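The swap itself is worth isolating so it can be checked on its own. A sketch; the names SwapBytes16 and SwapAll are mine and do not appear in the original source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Exchanges the high and low byte of a 16-bit value.
constexpr std::uint16_t SwapBytes16(std::uint16_t x) {
  return static_cast<std::uint16_t>(((x & 0x00ffu) << 8) |
                                    ((x & 0xff00u) >> 8));
}

// Swaps every element of an array in place.
inline void SwapAll(std::uint16_t* values, std::size_t count) {
  assert(values != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    values[i] = SwapBytes16(values[i]);
  }
}

static_assert(SwapBytes16(0x1234) == 0x3412, "bytes are exchanged");
static_assert(SwapBytes16(SwapBytes16(0xBEEF)) == 0xBEEF,
              "swapping twice is the identity");
```

The explicit cast matters because the shifts operate on promoted integers; without it, some compilers warn about the narrowing assignment back to 16 bits.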

And of course, I have to test it. I'll spare you the grisly details, but basically, I took a random .TTF file, looked at the first and last TexTemplate struct in a hex editor, and treated the values I saw as "golden data" to match against.

Picking the first and last values in the array, with 20 others in between (my test file happens to have 22 entries), gives me reasonable confidence that my loader isn't applying any weird offsets. Until I get to the point where I actually do something with these files, that will have to be good enough.
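For the shape of such a test without the hex editor, the sketch below builds a 22-record file in memory and checks the first and last record. Note that it decodes by assembling each value from its two bytes, which is a different technique from the swap-in-place loader above and works regardless of the host's byte order. DecodeTemplates, MakeSyntheticFile, and the synthetic "golden" values are invented for illustration; they are not taken from a real .TTF file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct TexTemplate {
  std::uint16_t tex16[16];
  std::uint16_t tex4[4];
  std::uint16_t tex1[1];
};

constexpr std::size_t kValuesPerTemplate = 16 + 4 + 1;

// Decodes big-endian records. Returns an empty vector if the input is
// not a whole number of records.
std::vector<TexTemplate> DecodeTemplates(
    const std::vector<std::uint8_t>& bytes) {
  constexpr std::size_t kRecordBytes = kValuesPerTemplate * 2;
  if (bytes.size() % kRecordBytes != 0) {
    return {};
  }
  std::vector<TexTemplate> result(bytes.size() / kRecordBytes);
  for (std::size_t r = 0; r < result.size(); ++r) {
    std::uint16_t values[kValuesPerTemplate];
    for (std::size_t i = 0; i < kValuesPerTemplate; ++i) {
      const std::size_t offset = r * kRecordBytes + i * 2;
      // High byte first: that is what "big-endian" means.
      values[i] = static_cast<std::uint16_t>((bytes[offset] << 8) |
                                             bytes[offset + 1]);
    }
    for (std::size_t i = 0; i < 16; ++i) result[r].tex16[i] = values[i];
    for (std::size_t i = 0; i < 4; ++i) result[r].tex4[i] = values[16 + i];
    result[r].tex1[0] = values[20];
  }
  return result;
}

// Builds a synthetic file in which value number n (counting across all
// records) is stored big-endian as n.
std::vector<std::uint8_t> MakeSyntheticFile(std::size_t record_count) {
  std::vector<std::uint8_t> bytes;
  for (std::size_t n = 0; n < record_count * kValuesPerTemplate; ++n) {
    bytes.push_back(static_cast<std::uint8_t>(n >> 8));
    bytes.push_back(static_cast<std::uint8_t>(n & 0xff));
  }
  return bytes;
}

// The "golden data" check: first and last record only.
void CheckFirstAndLast() {
  const std::vector<TexTemplate> templates =
      DecodeTemplates(MakeSyntheticFile(22));
  assert(templates.size() == 22);
  assert(templates.front().tex16[0] == 0);
  assert(templates.back().tex1[0] == 22 * 21 - 1);
}
```

Checking both ends of the array is what catches an off-by-a-few-bytes loader: a wrong starting offset breaks the first record, and an accumulated drift breaks the last one.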

Join me next time when, hopefully, I'll get through more than a single usable function in a whole blog post...

  1. wipEout was a PlayStation game first, and then ported to the PC.

  2. Certainly mine is.