The Developer’s Cry

a blog about computer programming

A generic Vec type in plain C (addendum)

Conspiciously missing in C are container classes. In the previous post we developed a generic Vec type in C by using a method that somewhat resembles having templates in plain C. As much as I like the clever hackery, there are a few wrinkles to be ironed out. Let’s have a close look at how we can fix that.

The Vec type has two main shortcomings:

The vector declaration macro is:

#define Vec(T)    Vec_##T

#define decl_Vec(T)         \
    typedef struct {        \
        T* data;            \
        size_t cap, len;    \
    } Vec(T)    /* <-- can not have "*" pointer there */

You can not have an asterisk there because it is a syntax error in the type declaration. Moreover it causes syntax errors in function declarations down the line. This can be circumvented with a typedef:

typedef Person* Person_ptr;

Vec(Person_ptr)* v = new_Vec(Person_ptr)();
push_Vec(v, new_Person("John"));

There, fixed that for you. But not so fast. There are implications to having a Vec of pointers that we need to address.

Dynamic typing

C has the neat feature that a void pointer can be any type of pointer. This means that a Vec(voidptr) can hold any type of pointer without the compiler complaining about it.

typedef void* voidptr;

Vec(voidptr)* v = new_Vec(voidptr)();
push_Vec(v, new_Person("John"));

This is interesting because we now have a type-unsafe vector, which acts like in a dynamic language; you can even mix pointer types in there.

We practically get dynamic typing for free, and we should be jumping for joy at this point. At the same time mixing types is irresponsible and dangerous, as we will see next.

Ownership

Previously we defined a function delete_Vec() that frees the memory occupied by the vector. A better name would have been free_Vec(). The free_Vec() function does a shallow free that only frees the storage of the container itself.

When you start putting pointers into vectors, you have to start caring about ownership; does the vector contain only raw pointers, or does it own the memory they are pointing to? The difference is important when it comes to finalizing and freeing the memory;

/* non-owned, raw pointers in Vec v */
Vec(voidptr)* v = new_Vec(voidptr);

const char* p = "hello!";
push_Vec(voidptr)(v, p);

free_Vec(voidptr)(v);

[This bit of code actually produces a compiler warning about the const qualifier. I’m just going to cheat and disregard const here].

The pointers in this Vec are just pointers. In the end we can free the space occupied by the container itself, and we’re done. The next code example uses “owned” pointers to allocated memory:

/* owned pointers in Vec v */
Vec(voidptr)* v = new_Vec(voidptr);

char* p = malloc(16);
strncpy(p "hello!");
p[6] = '\0';
push_Vec(voidptr)(v, p);
p = NULL;        /* code hygiene */

/* free_Vec(voidptr)(v);   <-- memory leak */
delete_Vec(voidptr)(v, free);

Freeing only the container itself would cause memory to leak, and therefore we have to introduce a delete_Vec() function that accepts a deleter function that can delete or free each container item. The deleter function for a character string allocated with malloc() is simply free().

Because we have to write our pseudo-template function in C as a macro, it becomes something like this:

#define delete_Vec(T)   delete_Vec_##T

#define impl_Vec(T)     \

    ... all the Vec functions are in here ...

    void delete_Vec(T)(Vec(T)* v, void(*deleter)(T)) {  \
        for (size_t i = 0; i < v->len; i++) {           \
            deleter(v->data[i]);                        \
        }                                               \
        free_Vec(T)(v);                                 \
    }

The deleter function is a destructor for all items in the vector. Note that the deleter takes an argument of type T. If you choose to dump typed pointers into a Vec(voidptr), then the deleter function must also take an argument voidptr to prevent compiler warnings. This is normal and a small price to pay for taking advantage of using voidptr;

void delete_Person(void* v) {
    Person* p = v;
    free(p->name);
    free(p);
}

...

Vec(voidptr)* v = new_Vec();
push_Vec(voidptr)(v, new_Person("John"));
push_Vec(voidptr)(v, new_Person("Connor"));

delete_Vec(voidptr)(v, delete_Person);

Note that the deleter applies to all items in the Vec. Therefore it’s risky to mix types in a Vec(voidptr).

The type-safe and better solution is to always specify the type. It involves doing a little more work instantiating the templated type:

typedef Person* Person_ptr;
decl_Vec(Person_ptr);
impl_Vec(Person_ptr);

void delete_Person(Person* p) {
    free(p->name);
    free(p);
}

...

Vec(Person_ptr)* v = new_Vec();
push_Vec(Person_ptr)(v, new_Person("John"));
push_Vec(Person_ptr)(v, new_Person("Connor"));

delete_Vec(Person_ptr)(v, delete_Person);

Now that we have a deleter function, we can even make a Vec of Vec_ptr.

Good stuff

I’m generally not too fond of the need for “ptr” types when we usually have the asterisk syntax for pointers. That said, I’m happy with how Vec turned out. Other options for having a growable array in C are usually not as elegant; they typically involve macros with unpleasant side-effects, while still requiring typecasts in the user code—take for example glib’s GArray.

In languages like C++ and Rust the native vector type handles more naturally, but this implementation is as good as it gets in C. Making an array of pointers and dealing with ownership is nothing new, but combined with a destructor-like deleter and a template-like syntax our generic Vec type in C is a solid reproduction.

I hope you can see it’s a small step from here to move the deleter function into the “object”, at which point we start morphing into C with classes.

Writing template functions as macros is still cumbersome, and having a real template preprocessor for C is still (kind of) on the wishlist.