C++ STL-compatible iterators
Last time we had a look at the deque, a container class that basically is a linked list of arrays. We made our own implementation, and one thing missing there was an accompanying iterator class. Iterators have become so ingrained in C++ coding styles, that it’s only logical that custom containers must support STL-compatible iterators. As you might know I’m not the greatest fan of the STL (C++ Standard Template Library), but even so I wouldn’t want to burden anyone with having to deal with non STL-compatible iterators. Therefore we will embrace it for once (a little) and write iterators, done the STL way.
An iterator is an object that allows you to easily loop over each item in a container. There are many kinds of container classes, and each has its own way of organizing its contents internally. The iterator abstracts away the details of accessing items, allowing to access containers in a generic way. No matter whether the container at hand internally is an array, linked list, binary tree or what not, the C++ idiom to loop over all items is:
MyContainer<int> c = ...;
for (auto it = c.begin(); it != c.end(); ++it) {
int item = *it;
...
}
// or even, in C++11:
for (int item : c) {
...
}
Also note the use of the generic auto
type to avoid having to declare the
exact, complicated iterator type.
This generic programming trick opens up the way to STL algorithms: a large
collection of convenience functions for containers; sorting, searching,
copying, transforming, and more. Just by adding an iterator to class
MyContainer
, we can now sort and search MyContainers. How cool is that?
Although iterators do add another level of abstraction (and thus another
level of indirection), the benefits outweigh the cost.
How to implement an iterator? We will start out simple and build it from there.
The iterator class is basically a pointer into the container. It implements
certain operators, most importantly operator++()
(increment position) and
operator*()
(get item at position). Since the iterator works in tandem with
the container, its class declaration will be nested in the container class:
template <typename T>
class MyContainer {
class iterator {
T *p;
public:
// constructors
iterator() : p(nullptr) { }
iterator(T *start) : p(start) { }
~iterator() { }
// equality
bool operator==(const iterator& o) const {
return p == o.p;
}
// inequality
bool operator!=(const iterator& o) const {
return !operator==(o);
}
// pre-increment position
iterator& operator++() {
if (p == nullptr) {
// ignored; no-op
return *this;
}
// FIXME we never detect the end
p++;
return *this;
}
// get item reference at
T& operator*() const {
assert(p != nullptr);
return *p;
}
};
T *data;
...
};
Class MyContainer
has a pointer to some allocated memory for storing items.
The iterator is declared as a nested class, which has a raw pointer directly
into MyContainer
’s space. Note that in this case I use a pointer, but
you might as well use an integer as an index here—it’s easy to think of
iterators as abstracted pointers, so I think it works better as an example.
We check for nullptr
, and I use assert
to bail out on an invalid operation.
You may also choose to throw an exception, but there seems to be no
standard way of doing this, probably use std::runtime_error
or so.
There are various other comments to make about this code, but for simplicity’s sake, this is the bare skeleton for beginning an iterator class. Indeed, iterator classes may become several pages long depending on what functionality needs to be included.
To create an iterator instance we must call the begin()
or end()
method:
template <typename T>
class MyContainer {
class iterator {
...
};
T *data;
public:
...
iterator begin(void) {
return iterator(data);
}
iterator end(void) {
return iterator();
}
};
So, begin()
creates an iterator to the beginning of data
, while end()
creates a “null” iterator. Thus we can write:
// loop over all items
for (auto it = c.begin(); it != c.end(); ++it) {
item = *it;
}
We should improve our implementation of the iterator, and add some more operators:
// copy constructor
iterator(const iterator& o) : p(o.p) { }
// move constructor
iterator(iterator&& o) : p(o.p) {
o.p = nullptr;
}
// copy assignment
iterator& operator=(const iterator& o) {
if (this == &o) {
return *this;
}
p = o.p;
return *this;
}
// move assignment
iterator& operator=(iterator&& o) {
p = o.p;
o.p = nullptr;
return *this;
}
// post-increment
iterator operator++(int) {
iterator old(*this);
this->operator++();
return old;
}
// dereference
T* operator->() const {
assert(p != nullptr);
return p;
}
Pay extra attention to the return type (and value) of the post-increment operator: it returns a copy of the iterator as it was before the increment was done.
With all of this in place, we now have a so-called “forward iterator”. It can be made bidirectional by adding pre- and post-decrement operators.
In its current state it’s a pretty decent iterator class, but still
isn’t STL-compatible. Making it STL-compatible is important for getting it
to work with C++ <algorithm>
. To make it work, just copy-paste from
the boilerplate:
#include <iterator>
class iterator : public std::iterator<std::forward_iterator_tag, T> {
T *p;
public:
typedef T value_type;
typedef T& reference;
typedef T* pointer;
typedef int difference_type;
typedef std::forward_iterator_tag iterator_category;
...
};
Finally, we should be able to do something useful, like this:
#include <algorithm>
void example1(MyContainer<int>& c) { // or: const&
// find element
auto it = std::find(c.begin(), c.end(), 10);
if (it != c.end()) {
// found value: *it
} else {
// not found
}
}
void example2(MyContainer<int>& c) {
// sort all items
std::sort(c.begin(), c.end());
}
The first example works, but the second doesn’t. Alas, std::sort()
requires a random-access iterator. A random-access iterator must implement
all of the following:
- copy-construct and copy-assign;
- equality and inequality;
- pre- and post-increment;
- pre- and post-decrement;
- arithmetic add and subtract;
- compound assignment for add and subtract;
- all comparison operators;
- dereferencing;
- indexing
It should be clear that writing a custom random-access iterator is, simply put, a lot of work. Right about here is where it ends being fun, but let me ramble on a little more for sake of completeness.
We haven’t yet touched upon constness. C++ is very picky about objects
being const
. It’s not just being pedantic, it’s important for code
correctness as it allows the compiler to do certain optimizations if it knows
that an object is not going to change state along the way. Therefore we should
implement const_iterator
, which is the ‘const’ version of iterator
.
A technique to avoid code duplication is to first write const_iterator
,
and then have iterator
inherit from const_iterator
:
class const_iterator : public std::iterator<std::forward_iterator_tag, T> {
T *p;
public:
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* pointer;
typedef const T* const_pointer;
typedef int difference_type;
typedef std::forward_iterator_tag iterator_category;
const_iterator() : p(nullptr) { }
...
const_iterator& operator++() { ... }
const_iterator operator++(int) { ... }
const T& operator*() const { ... }
const T* operator->() const { ... }
};
// the non-const version
class iterator : public const_iterator {
public:
typedef T value_type;
typedef T& reference;
typedef T* pointer;
typedef int difference_type;
typedef std::forward_iterator_tag iterator_category;
// can not use member initializers here :(
iterator() {
p = nullptr;
}
...
// reuse code from const_iterator whenever you can
using const_iterator::operator==;
using const_iterator::operator!=;
iterator& operator++() {
const_iterator::operator++();
return *this;
}
iterator operator++(int) {
iterator old(*this);
this->operator++();
return old;
}
T& operator*() const { ... }
T* operator->() const { ... }
};
...
const_iterator cbegin(void) const {
return const_iterator(data);
}
const_iterator cend(void) const {
return const_iterator();
}
Note how pointer p
in const_iterator
is not ‘const’ by itself.
This trick allows us to be liberal and easily call const_iterator
code
from the non-const iterator
code.
Lastly, bidirectional iterators allow for traversal in reverse. You may create a special reverse iterator for this. The reverse iterator can be generated from a standard template, all you need to do is add one magic line to the container class:
typedef std::reverse_iterator<iterator> reverse_iterator;
In theory you can also create a const_reverse_iterator
, but unfortunately
it only gave me compiler errors. I suspect an issue with the STL on
my dev platform, or maybe it happens because we took some liberties with
const_iterator
. There isn’t a lot of information to be found the web about
const_reverse_iterator
, which makes me even more suspicious.
Personally, I find reverse iterators to be confusing. Why not simply steer the
bidirectional iterator in opposite direction, and avoid reverse iterators
altogether.
All in all, iterators are a pretty complicated subject. Easy on the outside, but unexpectedly difficult on the inside. When done right, it gets quite tedious and full of C++ gagh. Just for your amusement, I have left in a glaring bug (see the FIXME remark) in order to show that writing iterators requires careful planning and minute testing.