Ned Bingham
edward.bingham@yale.edu
Resume, Curriculum Vitae

stdcore: a Toy Container Library

https://github.com/c-cores/stdcore.git

The functionality provided by the algorithm header in the C++ Standard Library is indispensable, but ultimately painful to use. Code that should take only one line ends up taking quite a bit more and is much less readable. This happens for two reasons. First, every function requires an iterator to the first and last elements for each range it operates on. Suppose there were a function foo() that returns a container. This property makes it impossible to call successive algorithms as in x = *max_element(foo()). This instead turns into:

auto y = foo();
auto x = *max_element(y.begin(), y.end());

Second, algorithms only operate on iterators which are unable to add or remove elements to the container. This means that algorithms that normally add or remove elements, like unique(), copy algorithms, merge(), set_difference(), set_difference(), and others, cannot do so. This turns something as simple and common as x = unique(sort(y)) into the following trainwreck.

array<int> x = y;
sort(x.begin(), x.end());
x.erase(unique(x.begin(), x.end()), x.end());

Finally, something that is a little less of a problem and a little more of a pet peeve. Whenever I need to compare two different containers I have to define the comparison functions in my own code and they always end up the same. For example operator <() tends to look something like this:

iterator i = a.begin();
iterator j = b.begin();
for (; i != a.end() && j != b.end(); i++, j++)
{
	if (*i < *j)
		return true;
	else if (*i != *j)
		return false;
}

return j != b.end();

An Alternative

So, after a few years of dealing with this, I developed a toy container library to test out a hunch. I want to emphasize that this is a toy library, not production ready. It has gone through quite a few architectural changes and I haven't been particularly thorough or consistent. It is neither well tested nor is it well documented. It serves only as a proof of concept. That said, it is fairly functional so have fun!

Power to the Iterator

I decided upon giving iterators power to add and remove elements from their parent structure. So, now every iterator must have a pointer to its parent structure and few basic functions. For an iterator i:

This ends up being a very clean interface for interacting with containers. Instead of the Standard library's v.insert(i, x.begin(), x.end()), we get i.append(x). Further, unique(x.begin(), x.end()) can now resize the container as necessary.

Keep in mind, there are a few trade-offs being made with this interface. First, each iterator has to maintain a pointer to its parent which needs to be dereferenced for these functions. Second, calling one of these functions will invalidate other iterators to the same container. This is the same as in the Standard Library, but a more iterator-centric interface makes this property a little less intuitive. Third, there are now three permission levels instead of two. An iterator that can change element values and add or remove elements, an iterator that can only change element values, and an iterator that cannot do either. I've only implemented the full and zero permission iterators.

Templated Algorithms

Next, I needed to combine the first and last iterator into a single input. I started by creating templated algorithm definitions in algorithm.h, search.h, sort.h, and filter.h that accept take a whole container as input. Now our code is looking much cleaner.

z = unique(sort_quick(x));
unique_inplace(sort_quick_inplace(y));

Generic Slices

The algorithm implementations are clean enough, but I also needed a way to call those algorithms on parts of a container as in the Standard C++ Library. I needed a range container so that I could work with a range of iterators. However, calling sort on a range of iterators tries to sort the range by comparing the iterators which is not the goal.

Enter generic slices. A slice wraps a container of iterators providing a direct container-like iterface to the iterated values. For example, I can now have slice<range<array<int>::iterator> > that implements the more prototypical slice or slice<list<array<int>::iterator> > that implements a generic selection of elements in an array using a list as intermediate storage of the iterators. If I call sort on one of these containers, it calls swap from and compares the values pointed to by the internal array<int>::iterator.

With a little bit of template magic, these slices can be made to function correctly with pointers as well as iterators. Now I can wrap a static array and treat it as a container. For example, a C string can be wrapped like this:

slice<range<char*> >(range<char*>(cstr, cstr+strlen(cstr)))

However, we still have a usability problem because I now have to type this monstrosity out every time I want to work with a slice.

slice<range<array<int>::iterator> >(range<array::iterator>(x.begin(), x.end()));

Little Helpers

To fix this, I need some helper functions. First, there is a series of sub functions in both the iterators and the main container class which returns a slice<range<iterator> >. For a container c and iterator i:

Second, there is a sample function in the main container class that use a container of values to as indexes into another container. This allows me to create generic slices without going back to the slice constructor. The only time I do need to type out the slice template is when I declare one because I want to keep it around for later. An example usage of sample:

array<float> x = array_t(5, 3.5f, 2.2f, 9.5f, 3.0f, 2.5f);
cout << array_t(3, 0, 4, 2).sample(x) << endl;
// {3.5, 2.5, 9.5}

Finally, for convenience I created two further functions. deref wraps a container of iterators with a slice interface and ref strips the slice interface from a slice leaving the underlying container of iterators.

The Pet Peeve

Now we have solved everything but the pet peeve. The best solution would simply use an interface (as exists in Go) to define the comparison functions. However, C++ doesn't have that kind of interface. So, we are left with three possible solves. The first is to create a header file that includes every container header and creates pairwise comparison operators for those containers. However, this bloats your code by forcing you to include every container header and breaks down as soon as you add a new container. The second method is to create a templated comparison function, but there isn't a good mechanism in C++ to ensure that it doesn't conflict with other comparison operators. The method I implemented uses an abstract base class which is the closest equivalent to interfaces in C++.

So, I created an abstract base class for all containers. However, because each container defines its own iterator classes, I had to template this base class against those differing iterator class definitions. This makes container definitions a little bit ugly.

template <typename value_type>
struct array : container<value_type, array_iterator<value_type>, array_const_iterator<value_type> >
{
	...
};

However, it also allows me to define templated functions that can only be applied to containers. Any new containers I define will automatically work with these functions, and I don't need to include a giant header.

template<typename v0, typename i0, typename ci0, typename v1, typename i1, typename ci1>
int compare(const container<v0, i0, ci0> &a, const container<v1, i1, ci1> &b)
{
	ci0 i = a.begin();
	ci1 j = b.begin();
	for (; i && j; i++, j++)
	{
		if (*i < *j)
			return -1;
		else if (*j < *i)
			return 1;
	}
	if (j)
		return -1;
	else if (i)
		return 1;
	else
		return 0;
}