Enumerating Enums

Every programming language has its defects, sometimes large, sometimes subtle, and often irritating. One thing that has been bugging me for a while (and many others, as a quick Google search shows) is that you cannot enumerate the values of an enum using basic language constructs. You just can’t. Using the STL (and C++ 2011), however, you can have a somewhat elegant solution.

The solution I propose in this post is robust, as it depends neither on the actual values of the enums nor on their being mappable onto integers. We arrived at this solution (we being me and people on the ##c++ channel on Freenode, especially user “moonchild”) and I think it (almost) solves the problem of enumerating enums.

A typical enum looks something like this:

enum { rare, medium, well_done };

The compiler assigns a value to each of the items (the first being zero unless you say otherwise, each following one being the previous value plus 1), uses a machine-friendly storage (most probably int), and you’re kind of not supposed to worry about their values. Except when you do:

enum { rare=1, medium=2, well_done=4 };

Which would allow you to do things like medium|rare to get a value of 3 (to represent medium-rare!). However, in many cases, you would like to do something like:

typedef enum { rare=1, medium=2, well_done=4 } cooked;

for (cooked how=rare;how<=well_done;how++)
 {
  ...

Which of course does not work if the symbols have been assigned more complex values. (In C++, as opposed to C, it does not even compile without casts or overloaded operators, since an int does not convert implicitly back to an enum.) Here, in our example, it suffices to change how++ to how<<=1, but what if the enumerated values are pretty much random-looking? Say you want to do something with opcodes in a virtual machine or processor emulator? Say:

typedef enum { nop=0, add=0x12, sub=0x99, mul=0x80, div=0 } instruction;

for (instruction op=nop;op< ??? ; ??? )
 {
  ...

Clearly, relying on enums being vaguely ints doesn’t work. In some sense, enums are sets (especially if you want strong typing) and one would expect to be able to write something like

for (some_enum::type x : some_enum::all_values)
 {
  ...

One possibility is to create a set with all enumerated values:

#include <set>

class some_enum
{
 public:

 typedef enum { a=1,b=2,c=1231,d=123121 } type;

 static const std::set<type> all_values;
};

And have, in a separate .cpp file:

#include <some-enum>
#include <set>

const std::set<some_enum::type> some_enum::all_values
 {some_enum::a,some_enum::b,some_enum::c,some_enum::d};

And now you can write:

  for (some_enum::type x : some_enum::all_values)
   std::cout << x << std::endl;

It will produce the expected:

1
2
1231
123121

It also allows us to test whether a value is an acceptable enum value. (But not as elegantly as ye olde Pascal in operator.)
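As a rough sketch of that membership test, the set-based class can be given two helpers. The names is_valid and from_int are hypothetical (they are not part of the solution above), and everything is put in a single file here only so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <set>

class some_enum
{
 public:

  typedef enum { a=1, b=2, c=1231, d=123121 } type;

  static const std::set<type> all_values;

  // Hypothetical helper: true if raw is the value of one of the
  // enumerators. The comparison is done on ints, so an arbitrary int
  // never has to be cast to the enum type before it has been checked.
  static bool is_valid(int raw)
  {
    for (type x : all_values)
      if (static_cast<int>(x) == raw)
        return true;
    return false;
  }

  // Hypothetical helper: checked conversion from int to the enum.
  static type from_int(int raw)
  {
    assert(is_valid(raw));
    return static_cast<type>(raw);
  }
};

// Would normally live in the separate .cpp file.
const std::set<some_enum::type> some_enum::all_values
  {some_enum::a, some_enum::b, some_enum::c, some_enum::d};
```

For a value that already has type some_enum::type, some_enum::all_values.count(x) does the same test with the set's logarithmic lookup; the linear scan above is only there to avoid casting an unchecked int.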

The solution we arrived at, however, is concerned only with enumerating the values. It is as follows:

#include <initializer_list>

class some_enum
{
 public:

 typedef enum { a=1,b=2,c=1231,d=123121 } type;

 static constexpr 
   std::initializer_list<type>
 all_values{a,b,c,d};
};

and, in a .cpp file elsewhere:

#include <initializer_list>
#include <some-enum>

constexpr std::initializer_list<some_enum::type> some_enum::all_values;

This solution allows you to use the exact same code for enumeration.
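For reference, here is a single-file sketch of the initializer-list version. The helper collect_values is hypothetical and only there to exercise the loop. Two assumptions are worth stating: the standard only made std::initializer_list usable in constant expressions in C++14, so a strict C++11 toolchain may or may not accept the constexpr member; and since C++17 static constexpr members are implicitly inline, so the out-of-class definition is only needed (and only emitted here) for older standards.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

class some_enum
{
 public:

  typedef enum { a=1, b=2, c=1231, d=123121 } type;

  static constexpr
    std::initializer_list<type>
  all_values{a,b,c,d};
};

// Out-of-class definition, normally in a .cpp file. Redundant (and
// deprecated) from C++17 on, hence the guard.
#if __cplusplus < 201703L
constexpr std::initializer_list<some_enum::type> some_enum::all_values;
#endif

// Hypothetical helper: copies the enumerated values, as ints, in the
// order in which they were listed in all_values.
std::vector<int> collect_values()
{
  std::vector<int> out;
  for (some_enum::type x : some_enum::all_values)
    out.push_back(static_cast<int>(x));

  // Sanity check: one entry per listed enumerator.
  assert(out.size() == some_enum::all_values.size());
  return out;
}
```

One difference from the set version: the initializer list keeps the values in the order you wrote them, whereas the set iterates in sorted order and drops duplicates.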

*
* *

Which one is faster? The (unsafe, partial) solution based on integers is very fast, at least as fast as an ordinary for loop. The solutions based on sets or initializer lists use iterators and generate quite a bit more code (you can find out using objdump -D -Mintel program), but both general solutions are robust to the enum values and do not rely as much on the hypothesis that enums are ints.

But what if iterating over an initializer list or a set causes a performance problem? Well, you can always revert to “enum-as-int” and make sure that your code is safe, providing support everywhere for undefined or unexpected values. I would guess that wrapping the enum in a class that knows how to do ++ will perform at best as efficiently as the set or initializer list solutions, and will cause other problems (because the more special cases you have, the more complicated the code is, the harder it is to maintain, and the easier it is to forget something or to break the code).
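To make the “knows how to do ++” idea concrete, here is a minimal sketch that uses an overloaded operator++ on the enum itself rather than a full wrapper class. Everything in it is hypothetical: the opcode enum, the op_end sentinel, and the next and count_opcodes helpers are made up for illustration.

```cpp
#include <cassert>

// Hypothetical opcode set with random-looking values, plus a sentinel
// that marks the end of the enumeration.
enum opcode { op_nop=0, op_add=0x12, op_sub=0x99, op_mul=0x80, op_end=0xff };

// Hypothetical successor function: every enumerator has to be listed
// by hand, in the order in which it should be visited.
inline opcode next(opcode op)
{
  switch (op)
  {
    case op_nop: return op_add;
    case op_add: return op_sub;
    case op_sub: return op_mul;
    case op_mul: return op_end;
    case op_end: break;
  }
  assert(false && "next() called on op_end or on an invalid value");
  return op_end;
}

// Pre-increment in terms of next(), so that ++op works in a for loop.
inline opcode& operator++(opcode& op)
{
  op = next(op);
  return op;
}

// Hypothetical helper: counts the opcodes by walking the enumeration.
int count_opcodes()
{
  int n = 0;
  for (opcode op = op_nop; op != op_end; ++op)
    ++n;
  return n;
}
```

The switch is exactly the kind of special case mentioned above: add an enumerator and forget to add it to next(), and the loop silently skips it.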
