A generic Vec type in plain C (addendum)
Conspiciously missing in C are container classes.
In the previous post we developed a generic Vec
type in C by using
a method that somewhat resembles having templates in plain C.
As much as I like the clever hackery, there are a few wrinkles to be
ironed out. Let’s have a close look at how we can fix that.
The Vec
type has two main shortcomings:
- writing “template” functions as macros is cumbersome;
- the type declaration macro expands such that we can not use pointers
as type
T
;
The vector declaration macro is:
#define Vec(T) Vec_##T
#define decl_Vec(T) \
typedef struct { \
T* data; \
size_t cap, len; \
} Vec(T) /* <-- can not have "*" pointer there */
You can not have an asterisk there because it is a syntax error in the type declaration. Moreover it causes syntax errors in function declarations down the line. This can be circumvented with a typedef:
typedef Person* Person_ptr;
Vec(Person_ptr)* v = new_Vec(Person_ptr)();
push_Vec(v, new_Person("John"));
There, fixed that for you. But not so fast. There are implications to having
a Vec
of pointers that we need to address.
Dynamic typing
C has the neat feature that a void
pointer can be any type of pointer.
This means that a Vec(voidptr)
can hold any type of pointer without
the compiler complaining about it.
typedef void* voidptr;
Vec(voidptr)* v = new_Vec(voidptr)();
push_Vec(v, new_Person("John"));
This is interesting because we now have a type-unsafe vector, which acts like in a dynamic language; you can even mix pointer types in there.
We practically get dynamic typing for free, and we should be jumping for joy at this point. At the same time mixing types is irresponsible and dangerous, as we will see next.
Ownership
Previously we defined a function delete_Vec()
that frees the memory
occupied by the vector. A better name would have been free_Vec()
.
The free_Vec()
function does a shallow free that only frees the storage
of the container itself.
When you start putting pointers into vectors, you have to start caring about ownership; does the vector contain only raw pointers, or does it own the memory they are pointing to? The difference is important when it comes to finalizing and freeing the memory;
/* non-owned, raw pointers in Vec v */
Vec(voidptr)* v = new_Vec(voidptr);
const char* p = "hello!";
push_Vec(voidptr)(v, p);
free_Vec(voidptr)(v);
[This bit of code actually produces a compiler warning about the const
qualifier. I’m just going to cheat and disregard const
here].
The pointers in this Vec
are just pointers. In the end we can free
the space occupied by the container itself, and we’re done.
The next code example uses “owned” pointers to allocated memory:
/* owned pointers in Vec v */
Vec(voidptr)* v = new_Vec(voidptr);
char* p = malloc(16);
strncpy(p "hello!");
p[6] = '\0';
push_Vec(voidptr)(v, p);
p = NULL; /* code hygiene */
/* free_Vec(voidptr)(v); <-- memory leak */
delete_Vec(voidptr)(v, free);
Freeing only the container itself would cause memory to leak, and therefore
we have to introduce a delete_Vec()
function that accepts a deleter function
that can delete or free each container item. The deleter function for a
character string allocated with malloc()
is simply free()
.
Because we have to write our pseudo-template function in C as a macro, it becomes something like this:
#define delete_Vec(T) delete_Vec_##T
#define impl_Vec(T) \
... all the Vec functions are in here ...
void delete_Vec(T)(Vec(T)* v, void(*deleter)(T)) { \
for (size_t i = 0; i < v->len; i++) { \
deleter(v->data[i]); \
} \
free_Vec(T)(v); \
}
The deleter function is a destructor for all items in the vector.
Note that the deleter takes an argument of type T
. If you choose to
dump typed pointers into a Vec(voidptr)
, then the deleter function
must also take an argument voidptr
to prevent compiler warnings.
This is normal and a small price to pay for taking advantage of using
voidptr
;
void delete_Person(void* v) {
Person* p = v;
free(p->name);
free(p);
}
...
Vec(voidptr)* v = new_Vec();
push_Vec(voidptr)(v, new_Person("John"));
push_Vec(voidptr)(v, new_Person("Connor"));
delete_Vec(voidptr)(v, delete_Person);
Note that the deleter applies to all items in the Vec
. Therefore it’s
risky to mix types in a Vec(voidptr)
.
The type-safe and better solution is to always specify the type. It involves doing a little more work instantiating the templated type:
typedef Person* Person_ptr;
decl_Vec(Person_ptr);
impl_Vec(Person_ptr);
void delete_Person(Person* p) {
free(p->name);
free(p);
}
...
Vec(Person_ptr)* v = new_Vec();
push_Vec(Person_ptr)(v, new_Person("John"));
push_Vec(Person_ptr)(v, new_Person("Connor"));
delete_Vec(Person_ptr)(v, delete_Person);
Now that we have a deleter function, we can even make a Vec
of Vec_ptr
.
Good stuff
I’m generally not too fond of the need for “ptr” types when we usually
have the asterisk syntax for pointers. That said, I’m happy with how Vec
turned out. Other options for having a growable array in C are usually
not as elegant; they typically involve macros with unpleasant side-effects,
while still requiring typecasts in the user code—take for example
glib’s GArray
.
In languages like C++ and Rust the native vector type handles more naturally,
but this implementation is as good as it gets in C.
Making an array of pointers and dealing with ownership is nothing new,
but combined with a destructor-like deleter and a template-like syntax
our generic Vec
type in C is a solid reproduction.
I hope you can see it’s a small step from here to move the deleter function into the “object”, at which point we start morphing into C with classes.
Writing template functions as macros is still cumbersome, and having a real template preprocessor for C is still (kind of) on the wishlist.