hikari – Entity Component System II

I’d intended to use this second post to dive into turning some existing systems in hikari into components. However, after a few attempts at writing it up, I think converting an existing system makes for a really terrible way to explain component systems. So instead I’m going to go over a few common decisions that were showing up a lot and discuss them directly. We’ll get back to the actual implementation after this.

Also, before I go any further, I wanted to clarify a point from last time. ECS is an abstract design pattern. There are many ways of implementing it, just like there are a variety of ways to implement abstract data structures. (i.e, a list could be an array, a linked list, a linked list of array chunks, etc.) There are tradeoffs for every implementation decision, so I don’t claim I’m implementing things in The Right Way. That said, this is a record of my implementation, not a broad study of ECS in general, so I may not mention all of the tradeoffs.

Components and Systems

Continuing the theme of “things are not what they seem”, components don’t necessarily exist in ECS. Which is to say, if you have a Foo component, you may or may not actually have a FooComponent struct anywhere.

The reason for this is simple: it is exceedingly rare to have *only one* of a thing. The structure we store things in determine its memory layout. Memory layout decisions for a group of things should be made with respect to the entire group and how it is used. Therefore, these decisions are made when implementing the system for a particular component.

For example, here’s a few ways we could layout a TransformSystem that holds parent- and world-relative spacial transforms as well as a scene graph structure describing the relationship between instances.

struct TransformSystem {
    // Option:  Array-of-structs
    struct TransformComponent {
        Vector3    local_position;
        Vector3    local_scale;
        Quaternion local_rotation;
        Vector3    world_position;
        Vector3    world_scale;
        Quaternion world_rotation;

        u32 parent;
        u32 next_sibling;
        u32 prev_sibling;
        u32 first_child;
    };
    u32 count;
    TransformComponent * transforms;

    // Option:  Struct-of-arrays
    u32 count;
    Vector3 *    local_position;
    Vector3 *    local_scale;
    Quaternion * local_rotation;
    Vector3 *    world_position;
    Vector3 *    world_scale;
    Quaternion * world_rotation;
    u32 * parent;
    u32 * next_sibling;
    u32 * prev_sibling;
    u32 * first_child;

    // Option:  Array-of-structs, use-case split
    struct Transform {
        Vector3    position;
        Vector3    scale;
        Quaternion rotation;
    };

    u32 count;
    Transform * parent_from_local;
    Transform * world_from_local;
    u32 * parent;
    u32 * next_sibling;
    u32 * prev_sibling;
    u32 * first_child;
};

A lot of material about memory layout pushes “struct of arrays for everything!”, and that’s a good default strategy, but it isn’t necessarily the best way to go every time. What gets used together? What doesn’t? When you optimize for cache performance in the most common code path, how many cache misses do you cause in the *second* most common code path? In the example, struct-of-arrays makes a lot of sense when we’re burning through the data linearly (as we do when updating the world transforms), but means three separate cache misses to look at a specific transform on its own. Which is more important is going to depend on factors that aren’t obvious and may change over time (how many other systems poke at individual transforms, say.) Hence why it’s nice to have this detail localized to the system itself — we can change this at will as long as no one else depends on our exact data layout.

The freedom to choose our data layout per system goes beyond struct-of-arrays, however. Need to keep the data sorted? Go ahead. Need to store it in a spatial lookup structure? Sure. Want to keep components in their own heap-allocated structs? That works too.

The listing above is missing one important field that all components will need: the entity it is attached to. Systems interact directly with each other. If we know our host entity, we can use that to query other systems for components attached to that same entity. (We also need that reference to check if our host has been destroyed. Remember: we’re not storing a list of attached components anywhere!)

Component Lookup

One of the key things we need to be able to do is, given an entity, find the component instance(s) attached to it. Once again, this is something we solve separately per-system. For many systems, we’re going to want to move instances around in our internal data structures, to keep them tightly packed or enforce ordering requirements. Pointers and indices alone won’t be sufficient in this case, because the instance moves around in memory.

We need a lookup structure between Entity handles and instances. How we choose to do this is a tradeoff between extra memory overhead and runtime.

On the low-memory side, we could just search the array of Entity handles. Linear search is O(n) in the number of active instances.

Option<Instance>
GetInstance(Entity e) {
    Instance result = Instance::Null();
    for (u32 i = 0; i < this->count; ++i) {
        if (e == this->entities[i]) {
            result = Instance(i);
            break;
        }
    }
    // NOTE: Option<T> stores an instance of T and whether that value is
    // valid.  Accessing an invalid value from Option<T> will assert.
    // I use this to force handling of the "not found" case by the caller.
    return Option<Instance>(result, !result.IsNull());
}

We can improve this further by letting the caller pass in a previous instance handle — before searching, check if that instance and entity combination is still valid:

Option<Instance>
GetInstance(Entity e, Instance inst=Instance::Null()) {
    if (!inst.IsNull() && inst.index < count) {
         if (e == this->entities[inst.index]) {
            // Past instance is valid
            return Option<Instance>(inst, true);
        }
    }
    // No longer valid, need to search again:
    ...
}

If the system doesn’t store a huge number of instances, and instances don’t move around very often, this is a perfectly reasonable approach. No memory overhead (we need that entity list anyway), and constant time in the common case.

Of course, we’re not going to be so lucky with some systems. For example, we can reasonably expect that the *majority* of entities have a Transform — anything that exists spatially in the world needs one. Transforms are also used by a lot of different systems, so we need to perform the lookup often. Finally, for the sake of its own updates, Transforms have ordering requirements that mean they move around in the internal structures pretty often.

We can solve this one with a linear lookup table indexed by the entity index:

Option<Instance>
GetInstance(Entity e) {
    Instance result = Instance::Null();
    if (e.index < this->lookup_table.size()) {
        result = this->lookup_table[e.index];
        // Check for stale entity:
        if (this->entities[result.index] != e) {
            result = Instance::Null();
        }
    }
    return Option<Instance>(result, !result.IsNull());
}

Unlike the lookup table in the entity manager I discussed last time, there’s no strict requirement on this lookup table not moving around. Therefore, we can use std::vector (or equivalent), sized to hold the highest entity index we’ve ever been attached to. GetInstance can immediately reject entities beyond that point. This gives us O(n) memory overhead in the number of entities that exist in the world.

Somewhere between these two on the memory/runtime tradeoff is a hashtable mapping Entity handles to Instance handles. This requires less memory for component types that aren’t very common (no need to store null Instances for every entity), while having better asymptotic runtime than searching the Entity array. (Search may still win in practice for small enough systems.)

Multiple Instances

The above lookup functions only really work if each Entity can have only one of a particular component. This makes sense in some cases (what would it even *mean* to have multiple Transforms?) In many other cases, though, we’d really like to be able to use multiple instances per entity. For example, an explosion effect might have several different particle systems attached (smoke, rubble, sparks, etc.)

However, in this case, how do we reference a particular instance of a component from some other system? The host entity’s handle is no longer a unique identifier for the instance.

I don’t really have an answer I’m confident in, but my working theory is the system that allows multiple instances should handle this internally. Specifically, by expanding its data structures.

First, we add a unique id field to the Instance struct. This isn’t another index, just a monotonically increasing counter of instances created. (I am very explicitly *not* using the Nth instance attached to an entity as it interacts badly with adding / removing instances.) Next, we add two arrays to our instance data, one that points to the next instance assigned to the same entity, the next containing the instance id. This gives us a linked list of instances for a given entity — by putting this in its own array, we keep it out of the way during most operations, but increase the chance of hitting cache when we *are* iterating the list.

GetInstance() gets a bit more complex:

Option<Instance>
GetInstance(Entity e, Instance inst) {
    if (inst.IsNull()) {
        // Caller is expected to have a non-null instance already.  (As returned
        // by AddInstance(), etc.)  Otherwise we don't know the correct id.
        return Option<Instance>(inst, false);
    }

    if (inst.index < count) {
         if (this->entities[inst.index] == e && this->instance_ids[inst.index] == inst.id) {
            // Given instance reference is still valid.
            return Option<Instance>(inst, true);
        }
    }

    // Find the entity's first instance, using one of the methods above:
    Instance current = FindFirstInstance(e);
    while (current.id != inst.id && !current.IsNull()) {
        u32 next_index = this->next_instance_index[current.index];
        u32 next_id = this->instance_ids[next_index];
        current.id = next_id;
        current.index = next_index;
    }

    // Result is valid if the id matches the target.
    return Option<Instance>(current, current.id == inst.id);
}

I’m not a huge fan of this. More state to keep in sync, and potentially a lot of effectively useless overhead (i.e., if *most* entities with a FooComponent only have one, the next_instance_index array is going to be full of nulls.) On the other hand, we only pay a performance cost when we actually *use* multiple instances per entity.

¯\_(ツ)_/¯

Generalization

So, we have a variety of approaches to solve instance storage and lookup on a per-system basis, but do we really have to do this every time? Many systems are going to want to use the same approach.

I don’t really know. However, I suspect we can get away with sharing a lot of the lookup code, by storing the lookup data in its own struct, templated on the system’s Instance type. Actual component data management is unlikely to be easy to share, but we can write helper code for things like “allocate a bunch of arrays from a single data block”.

Okay, onward to actual code.  Next time.

hikari – Entity Component System I

Happy New Year, everyone.

So far, almost all of the work on hikari has been focused on rendering. In order to continue working on my terrain project, I need to flesh out more of the underlying engine systems. (To be perfectly clear: creating the impetus to work on these systems is part of why I choose this project to work on.) As a starting point, I want to set up an Entity Component System architecture for managing the simulation.

Why? Well, let’s look at what I have right now:

State of the World

The simulation state of the world is unceremoniously dumped in the World struct, which contains lists of meshes, particle systems, and other persistent data (i.e., sun direction.) Also worth noting, several important things are *missing* from this structure — the camera setup, for instance. This is largely an artifact of being hacked together with very little care while testing the renderer itself.

During rendering, the Renderer gathers a list of meshes by calling World::GetRenderMeshes(), which in turn does a lot of (ultimately unnecessary) work to splice together a list of RenderMesh structs from various sources. Terrain meshes and meshes loaded from OBJ files are stored by their various systems. This is less of a performance problem than it may seem (it only happens once a frame), but it’s still silly and ugly and bug prone.

Furthermore, there’s very little structure to the world state at the moment. Each RenderMesh contains a 4×4 world_from_local transform matrix and that is the only transform information known for any mesh. This works, but is less than ideal.

Entity Component System

My proposed solution to this problem is using an Entity Component System architecture to manage all of the simulation state. ECS is a pattern where each Entity in the world is composed of a collection of Components, each of which is processed by some System.

This is the point where most explanations of ECS seem to end. But I’ve explained the concept, not the implementation, and that’s important here. In a standard object-oriented model, you might implement ECS by having each entity store a list of references to components, that live in lists managed by the component’s system. This is certainly possible — both Invaders! and a second (abandoned) project on that engine used this method for entities. However, this led to nasty macros, code generation tools, and generally being a pain to work with. (A dynamic array *per entity*, eek!)

So. I’m not going to do that this time.

A key realization is that we basically never need a list of all components belonging to an entity (at runtime, at least.) So why store it? Systems which need to interact with each other can do so because they have the same entity ID.  Without that list of components, the Entity is just a unique identifier, a glorified integer. Or a foreign key, if you’re familiar with relational databases. Because that’s all an ECS is: a hyper-specialized relational database.

Of course, there are some potential problems introduced here. For example, if an entity doesn’t know all it’s components, how can it destroy all of them when the entity is destroyed? Well, it can’t. But we can solve this problem on a per-system basis — query the entity list to see if a particular component’s owner is still alive, for example. Or, set up a way to register callbacks on entity destruction, for systems where the live check is too much overhead. For components that are cheap to update, maybe we don’t even care.

Entities

Okay, enough theory, let’s talk about something concrete. I’m borrowing a fair bit of the structure of my implementation from Niklas Gray’s blog series about the entity system in Bitsquid / Stingray.

An Entity is actually two numbers that together form a unique ID — index and generation. I use 32-bit and 16-bit integers for these. The index is unique among all *currently live* entities. The generation indicates how many times that index has been used. If I destroy an entity, its index is open for reuse, but the next entity created with that index will have a different generation.

The generation value is also stored in a list owned by the EntityManager. This list is indexed by the entity index. A given Entity value is valid if and only if its generation is equal to the generation value stored in the EntityManager at that index. This is not perfect — there are a finite number of bits in the generation value and it will eventually roll over and repeat. However, by increasing the number of bits, we can ensure this is *highly unlikely* in practice.

The EntityManager has three main operations: creating entities, destroying entities, and checking whether an entity is alive. Of these, the liveness check is by far the most common operation (any component that wants to ensure its owner is alive needs to call this!) In order for the IsAlive() call to be both fast and thread-safe, the generation lookup table can never be reallocated.

Easy enough, I’ll preallocate it. I need one slot for each possible (32-bit) entity index and each value is 16 bits, which gives us an array size of… 8 gigabytes.

Believe it or not this is fine. Because I’m going to cheat.

The important thing is that the address of the lookup table never changes, and that it can grow — without moving — to support however many entities end up being used. Instead of fully allocating the full size, we can instead use OS-specific calls to reserve that much *address space*. While the typical PC has somewhere between 2-16GB of physical memory, the address space of a 64-bit process is *enormous*. 8GB of address space is nothing. As more memory is needed, we can simply commit more pages in that range. And should we ever run on a PC with terabytes of RAM, there’s no artificial limitation keeping us from using all 4-billion or so entities.

No real code listings here, as most of it is uninteresting. Next time, we’ll look at some actual component systems, which are where the real benefits show up.