Extending C++’s using statement

If you’re using C++ as your principal programming language as I do, you certainly know some or most of its capabilities, some of its limitations, and you’re surprised once in a while by a new construct or feature you never thought of even trying in C++.

Having inherited most of its basic behavior from C, C++ still has many quirks and omissions that keep it from becoming a true next-generation language. The best example, to my mind, is the absence of true arrays. Arrays are pointers to stuff: sometimes you can get the size of the array, but most of the time you’re stuck with the size of the pointer, which is of no use and forces the user to explicitly manipulate array meta-data (curiously enough, it wouldn’t take much to know the size of an array all the time, because new[] and delete[] do hide meta-data to do exactly that).

Other limitations are conceptual or syntactic in nature. The conceptual limitations are made apparent when the compiler spews 45 lines of error for unknown type itn used as a template argument. Others are syntactic in their manifestation while being conceptual at heart, such as the manipulation of scopes.

There are methods for making explicit namespaces, the principal scoping mechanism, but there’s no real way to manipulate instances of objects as scopes. In Pascal, for example, you could write:

with this_record do
begin
  name:='toaster'; { ok if public! }
  slots:=4;
end;

Which would translate name as this_record.name. While this particular example seems rather pointless (the syntactic sugar is as ugly as the annoyance it is meant to solve), consider now:

with lists^.next[p,k] do
begin
  name:='toaster'; { ok if public! }
  slots:=4;
end;

The statement that describes the instance to manipulate is now rather long, and also rather expensive to compute. There’s a pointer dereference and a possibly expensive index computation. If the compiler is not capable of eliding this computation, you have a rather important performance penalty. But the main gain doesn’t come from saving 10 cycles (or whatever it may be on your target architecture) but from tremendously increased legibility and also greatly reduced chances for errors.

Python has a similar construct (its semantics are slightly different), but C++ does not. I guess it would not be very difficult to extend the syntax in the spirit of the language: we already have the using keyword. Although meant for namespaces (as in using namespace std;, which you should never do, by the way), it could be extended to create a scope from any scope-generating instance. The above Pascal example would become, in C++:

using *this_item
 {
  name="toaster"; // ok if public!
  slots=4;
 }

Note that we can forgo the do particle; it’s not C++ style anyway. The look-up rules for using would not be very different from normal look-up. If it is a struct- or class-like object, the first scope to consider is the object itself, but only as seen from outside, that is, through its public members. Thus name expands to this_item->name and the normal access rules apply: if name is public, all’s good; otherwise it results in an error. If name is not a member of *this_item, look-up continues as usual: block-nested scopes, class scope (we could be in a method), current namespace, global namespace. Writing using this_int (with this_int being a simple int) would result in an error, since an int is not a scope-generating instance.

*
* *

While this can seem a minor performance issue (after all, compilers should be smart enough to detect a repeated expression such as list->next[p][k], or whatever), remember that not all compilers are created equal. We see performance variations between compilers: one may win over another on a given piece of code, but lose on something else. That's, after all, an implementation detail, and we can usually trust the compiler to deal with common sub-expression elimination at the block or function level. Still, it can't hurt to hint the compiler about what you're doing.

The important gain is for legibility and correctness of the code. The legibility part is debatable, as you may correctly point out that the programmer now has to know which object owns what field, and that we lose some explicit scoping information with the proposed using syntax. True, but it also eliminates tedious repetitions (which are themselves potential sources of error) and makes clear what object you're manipulating in this block, which both help with correctness.

I can read your mind. You're thinking "C++ is already complex enough, do we need yet another construct?" I think that if you can add something easily, if it is in the spirit of the language, and it can help you write better, more legible, and more correct code, why not?

3 Responses to Extending C++’s using statement

  1. Fredrik Arnerup says:

    Why not just use a reference in a local scope?


    {
    ItemType &item = *this_item;
    item.name="toaster"; // ok if public!
    item.slots=4;
    }

    Not much more typing, is it? Also, the Pascal with approach is ambiguous: is slots a member of this_record or is it a local variable in the outer scope? You’ll have to check the type of this_record to know. Admittedly, there is a similar ambiguity with local variables and member variables, but I don’t think there is a reason to add another one. Like the pythonistas say, “explicit is better than implicit”.

    • Steven Pigeon says:

      That’s what I do currently. I use this_thingie as either reference or pointer (whichever makes more sense), but you still have this_thingie as a prefix to everything.

      And yes, I know it is ambiguous (I mention the problem in the post), but to the programmer only: the look-up rules would be the same as usual. A local variable would mask a variable with using scope, then the rest would be as it is already.

  2. Evan Jones says:

    I don’t want to add my opinion about whether it is the right choice or not, but there are logical reasons that C++ doesn’t provide you the size of the array. First, it was designed to be compatible with C, where arrays are really just weak syntactic sugar for pointers. Second, for types without destructors, I don’t think delete[] needs the size: there is nothing to be destructed, and some types of memory allocators don’t need to know the exact memory size. Therefore, adding this requirement would impact memory size and/or run time speed.
