The Developer’s Cry

a blog about computer programming

C with objects

In my quest to create a generic Vec type in plain C (see the September posts of this blog) I ended up with a delete_Vec() that requires an item-deleter argument. The item-deleter is a handler function that deletes the individual items when the entire Vec gets destroyed. Matters get more complicated when nesting containers in containers, however. While it certainly is possible to make a Vec-of-Vec in this scheme, the item-deleter passed at the highest level of nesting must now know how to deal with the nested containers. The design is not fully “orthogonal”, which leaves a nagging feeling that the generic Vec is still missing something. In the concluding remarks I already mentioned that it would be a small step to move the deleter into the “object”, at which point we start morphing into C with classes. Let’s see how that pans out.

Consider an example structure that represents a person:

typedef struct {
    OBJECT;
    char* name;
    int age;
} Person;

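Here we have tagged the struct with a special macro OBJECT. What is that thing? The macro expands to a pointer member, which points at a class definition. It must be the first member of the struct, so that the class can be found from a plain void pointer to the object. A class is a description of the type: its name, its size, and how the struct is initialized and deinitialized.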

#define OBJECT  ObjectClass* objclass_

typedef struct {
    const char* name;
    size_t size;
    void (*init_func)(void*);
    void (*drop_func)(void*);
} ObjectClass;
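Why does the class pointer have to come first? C guarantees that a pointer to a struct, suitably converted, points at its first member, so any object can be asked for its class without knowing its concrete type. Below is a minimal sketch of that idea. The helper class_of() and the compile-time check are illustrative assumptions, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char* name;
    size_t size;
    void (*init_func)(void*);
    void (*drop_func)(void*);
} ObjectClass;

#define OBJECT  ObjectClass* objclass_

typedef struct {
    OBJECT;
    char* name;
    int age;
} Person;

// compile-time check (C11): the class pointer really is the first member
_Static_assert(offsetof(Person, objclass_) == 0, "OBJECT must come first");

// hypothetical helper: look up the class of any object through a void pointer
ObjectClass* class_of(void* v) {
    assert(v != NULL);

    // valid only because objclass_ sits at offset 0 of every object struct
    return *(ObjectClass**)v;
}
```

If someone puts OBJECT anywhere but at the top of a struct, the static assertion for that type stops the build instead of letting the cast read garbage at run time.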

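The initializer function is named init. Creating an object is alloc followed by init. The create and init macros paste the type name into the name of the matching class definition, objectclass_T_.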

#define create(T)       create_object(&objectclass_##T##_)
#define init(x,T)       init_object((x), &objectclass_##T##_)

void init_object(void* v, ObjectClass* c) {
    panic_if(v == NULL);
    panic_if(c == NULL);

    // zero out all struct members
    memset(v, 0, c->size);

    // the object instance knows what class it is
    // (it is the first struct member)
    *(ObjectClass**)v = c;

    // initialize the instance
    if (c->init_func != NULL) {
        c->init_func(v);
    }
}

void* create_object(ObjectClass* c) {
    panic_if(c == NULL);

    // create is alloc + init

    void* obj = malloc(c->size);
    panic_if(obj == NULL);

    init_object(obj, c);
    return obj;
}

Deinitialization is named drop. Note that drop by itself only empties the object; it does not deallocate it. In order to deallocate (the counterpart to create), call destroy.

void drop(void* v) {
    panic_if(v == NULL);

    ObjectClass* c = *(ObjectClass**)v;
    panic_if(c == NULL);

    // deinitialize the instance
    if (c->drop_func != NULL) {
        c->drop_func(v);
    }
}

void destroy(void* v) {
    if (v != NULL) {
        drop(v);

        // for safety, reset the object's class
        *(ObjectClass**)v = NULL;

        free(v);
    }
}

The declaration and implementation of the objectclass for our Person are eased by macros:

#define decl_Object(T)  \
    void init_##T(void*);                   \
    void drop_##T(void*);                   \
    extern ObjectClass objectclass_##T##_

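// assumed definition (not shown in the post): turn the type name into a string
#define OBJECTCLASS_AS_STRING(T)    #T

#define impl_Object(T)  \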
    ObjectClass objectclass_##T##_ = {      \
        .name = OBJECTCLASS_AS_STRING(T),   \
        .size = sizeof(T),                  \
        .init_func = init_##T,              \
        .drop_func = drop_##T,              \
    }

decl_Object(Person);

impl_Object(Person);

What is left is to write the functions init_Person() and drop_Person(). These functions are straightforward. The only caveat is that they take void pointer arguments rather than Person pointers. The reason is that the function pointers in ObjectClass need a single signature, so that create/destroy can work with any type of object.

void init_Person(void* v) {
    panic_if(v == NULL);

    Person* p = v;
    p->name = NULL;
    p->age = -1;
}

void drop_Person(void *v) {
    panic_if(v == NULL);

    Person* p = v;
    if (p->name != NULL) {
        free(p->name);
        p->name = NULL;
    }
    p->age = -1;
}

With the plumbing in place, we can write some user code:

Person* p = create(Person);

p->name = string_from("Joe Jackson");
p->age = 42;

printf("Person: %s (%d)\n", p->name, p->age);

destroy(p);
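The snippets lean on two helpers that are not shown in this post: panic_if() and string_from(). Their real definitions may well differ. The following is only a guess at their behaviour, inferred from how they are used here: panic_if() aborts when its condition holds, and string_from() returns a heap-allocated copy that drop_Person() can later free().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// assumed stand-in for panic_if: report the failed check, then abort
#define panic_if(cond)                                      \
    do {                                                    \
        if (cond) {                                         \
            fprintf(stderr, "%s:%d: panic: %s\n",           \
                    __FILE__, __LINE__, #cond);             \
            abort();                                        \
        }                                                   \
    } while (0)

// assumed stand-in for string_from: make a heap-allocated copy of a C string
// the caller owns the result and must release it with free()
char* string_from(const char* s) {
    panic_if(s == NULL);

    size_t n = strlen(s) + 1;   // include the terminating NUL byte
    char* copy = malloc(n);
    panic_if(copy == NULL);

    memcpy(copy, s, n);
    return copy;
}
```

The important property is ownership: because string_from() allocates, assigning its result to p->name hands the buffer over to the Person, and drop_Person() is the one place that frees it.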

This may look underwhelming, but notice how we have just handcrafted a foundation for a hypothetical new framework.

Da heap versus da stack

The previous example allocated the object from the heap. We can also place objects on the stack. Mind that C does not automatically zero-initialize a local variable, nor does it clean up after it, so we must manually call the object’s init and drop.

Person p;
init(&p, Person);

p.name = string_from("Joe Jackson");
p.age = 42;

printf("Person: %s (%d)\n", p.name, p.age);

drop(&p);

Manually calling init and drop is tedious and error-prone. For some reason it feels less natural than pairing create/destroy.

You could avoid stack allocations altogether (as a matter of code style); just be aware that there’s a performance penalty when doing everything on the heap.

Copy, Move

Besides init and drop, it is convenient to have a copy function that makes a deep copy of an instance of an object. A deep copy is wasteful in situations where the original is discarded afterwards anyway (e.g. when placing the object into an array). Then it is better to move the object instead: the destination takes over the resources, and the source is left empty. Adding copy and move to our framework means that we now have to implement four functions per class. I call it “my rule of four”.
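To make the difference between copy and move concrete, here is a rough sketch for Person. The extra copy_func and move_func members, their signatures, and the convention that dst is uninitialized storage are all assumptions of this sketch; the post does not spell them out.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char* name;
    size_t size;
    void (*init_func)(void*);
    void (*drop_func)(void*);
    void (*copy_func)(void* dst, const void* src);  // assumed signature
    void (*move_func)(void* dst, void* src);        // assumed signature
} ObjectClass;

#define OBJECT  ObjectClass* objclass_

typedef struct {
    OBJECT;
    char* name;
    int age;
} Person;

// deep copy: dst ends up with its own heap copy of the name
void copy_Person(void* dst, const void* src) {
    Person* d = dst;
    const Person* s = src;

    *d = *s;    // copies the class pointer, the age, and (for now) the name pointer
    if (s->name != NULL) {
        size_t n = strlen(s->name) + 1;
        d->name = malloc(n);
        if (d->name == NULL) {
            abort();
        }
        memcpy(d->name, s->name, n);
    }
}

// move: dst takes over the name buffer, src is left empty
void move_Person(void* dst, void* src) {
    Person* d = dst;
    Person* s = src;

    *d = *s;            // take all members, including the name pointer
    s->name = NULL;     // src no longer owns the buffer
    s->age = -1;        // the same empty state that init_Person() sets up
}
```

After a move, the source is still a valid object: dropping it is harmless because there is nothing left to free. A move costs a struct assignment, whereas a copy costs an allocation per owned resource.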

Evolution

Obviously we are mimicking how C++ works internally. Why are we doing this, again? A generic container requires that its items can be destroyed in a generic way. One way to realize that is to introduce the concept of a destructor. My initial aim never was to take it in this direction. It’s funny how object-oriented programming naturally emerges from the desire to have generic containers.

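Interestingly, in languages like Python, Go, and Rust you typically write the constructor, but rarely a destructor. Python and Go are garbage-collected, and Rust drops values automatically when their owner goes out of scope. Explicit cleanup code (Python’s __del__, Rust’s Drop trait) is mostly reserved for resources other than plain memory.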

There is no denying that mimicking C++ in plain C is convoluted. Strictly speaking, you don’t need object-oriented programming to implement generics.

Back to our Vec implementation for a moment; rather than passing an item-deleter to delete_Vec(), the Vec could store the item-deleter as a member, set once when the Vec is created. This way a Vec always knows how to delete its items [at this particular level of nesting]. It is a simple solution that does not require reinventing C++.
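A minimal sketch of a Vec that carries its own item-deleter could look like this. It is not the Vec from the September posts: the struct layout and the names ItemDeleter, new_Vec and push_Vec are made up for illustration, and the items are assumed to be plain void pointers.

```c
#include <stdlib.h>

typedef void (*ItemDeleter)(void*);

typedef struct {
    void** items;
    size_t len;
    size_t cap;
    ItemDeleter item_deleter;   // may be NULL: the Vec does not own its items
} Vec;

Vec* new_Vec(ItemDeleter deleter) {
    Vec* v = malloc(sizeof(Vec));
    if (v == NULL) {
        abort();
    }
    v->items = NULL;
    v->len = 0;
    v->cap = 0;
    v->item_deleter = deleter;
    return v;
}

void push_Vec(Vec* v, void* item) {
    if (v->len == v->cap) {
        // grow the backing array: start at 8 slots, then double
        size_t cap = (v->cap == 0) ? 8 : v->cap * 2;
        void** p = realloc(v->items, cap * sizeof(void*));
        if (p == NULL) {
            abort();
        }
        v->items = p;
        v->cap = cap;
    }
    v->items[v->len++] = item;
}

// takes void* so that delete_Vec can itself be the item-deleter of an outer Vec
void delete_Vec(void* p) {
    Vec* v = p;
    if (v == NULL) {
        return;
    }
    if (v->item_deleter != NULL) {
        for (size_t i = 0; i < v->len; i++) {
            v->item_deleter(v->items[i]);
        }
    }
    free(v->items);
    free(v);
}
```

With this in place a Vec-of-Vec is created as new_Vec(delete_Vec), and each inner Vec as, say, new_Vec(free). Deleting the outer Vec then cascades down on its own, and no deleter needs to know about any level of nesting but its own.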

Nevertheless it was neat figuring out C with objects.