Monday, November 21, 2016

Putting the fun back in C with array slicing and dicing (revisited)

I’ve been writing a lot about putting the fun back in C by leveraging C++. Using strings, arrays, and dictionaries you can really speed up development time, as it makes C act as a scripting language, while maintaining a high level of performance.
Last July I wrote about array slices, and implemented a copy-on-write array that uses a shared_ptr to point at the actual data. Because shared_ptr, the array can be sliced and sliced again, and they all share the same backing store. This is great for getting scripting language-like behavior, but now I want to get back to game programming. Game code asks for a different style of programming—more crude, closer to the metal. Therefore we will revisit arrays and slices, and implement a new slice class.

To be truthful, for the implementation of such a slice class I had a good look at how slices work in the Go language. In Go, an array is a very static and bare-bones entity: it’s just a sequence of bytes. The data structure to actually work with, is the slice. A slice has a pointer to the array, and it knows the size of the array (capacity). On top of that, it has a length so you can use the ~~array~~ slice as a (string) buffer, stack, etcetera. An image tells a thousand words:

If you want to create a new slice plus backing array, it has to be allocated with make(), like so:

// this is Go code
s := make([]byte, 256)

Note however that the size is fixed, and the array does not grow beyond its limits. For a language like Go I find it quirky and primitive, but this is exactly what I need for my game code. And that code is in C, so here we go.

template <typename T>
struct slice {
    uint len, cap;
    T *data;
    bool owner;

    ...

    T& operator[](int idx);
    void append(const T& elem);
    T& pop(int idx=-1);
    slice<T> operator()(int start, int end) const;
    void clear(void);
};

It’s ironic how there is nothing in Go that you can not do in C++.
Caveat: We stuck a raw pointer in there. Ideally, it should be a shared_ptr. Unfortunately it’s a rats nest because C++ shared_ptr doesn’t properly handle shared arrays. Grunt, recall that I just said that there is nothing you can not do in … (!)
A clean solution would either be based on std::vector, or be just like the array class that we already wrote.
My actual code doesn’t care so much and deals with it more akin to unique_ptr; it tracks the ownership of the backing array via a bool. It is left up to the programmer herself to make sure that once the “owning” slice has been destroyed, any subslices can no longer be used. It’s not perfect, but this compromise works out pretty well in practice.

We can easily mimic the len() and cap() functions:

template <typename T>
uint len(const slice<T>& s) {
    return s.len;
}

template <typename T>
uint cap(const slice<T>& s) {
    return s.cap;
}

Note how len() and cap() could also be implemented as friend functions. For now, I’m writing sloppy C++ code and I don’t care about encapsulation one bit. More on that later.

And we can mimic Golang’s make() function:

template <typename T>
slice<T> make(uint n) {
    assert(n != 0);
    return slice<T>(new T[n], 0, n, true);
}

Finally, there is a copy() function. I don’t have much use for it, really. I think that’s mostly because game code doesn’t need slice or array copies.

template <typename T>
void copy(slice<T>& dest, const slice<T>& src) {
    // basically this, but ... read below
    for(uint i = 0; i < n; i++) {
        dest[i] = src[i];
    }
}

In this case, using operator=() is important (opposed to memcpy()) because we may be dealing with C++ objects rather than just raw bytes. (You may use template specialization for an optimized case when dealing with POD types).
Also mind that slices may overlap. If the destination overlaps with the source, then you should do a backwards copy because it will overwrite elements in the backing array.

On the surface, the implementation of our slice class is similar to the array class that we previously wrote. The static nature of the backing array changes the game though, and it handles very differently in practice.

Game code asks for a rudimentary approach. This style of code takes some shortcuts so it doesn’t like to bother with shared_ptr (although you can, as noted), OOP intrinsics such as public/private membership, and it uses assert() rather than “always-on” runtime bounds checking. It simply bails out when something is amiss.
Whatever the code style, you are always going to need data structures and a pleasant API; the programmer will always want to have good tooling at hand to get the job done. C++ is a terribly over-complicated language, but it’s a good fit for API-building and putting the fun back in C.