Python References vs C and C++

As I’ve mentioned before, my new job will ask me to program more in Python than C++, and that’s somewhat new for me. Of course, I’ve criticized Python’s dismal performance on two occasions, which got me all kinds of comments (from “you can’t compare performance like that!” to “use another language then” by way of “bind to external high-performance libraries”).

But it seems that my mastery of Python is still quite inadequate, and yesterday (at the time of writing, anyway) I discovered how Python’s by-reference parameters work. Unlike C or C++, which use explicit syntax to specify how an argument is passed (by value, by pointer, or by reference), Python is a bit sneaky.

Let us start with a simple C program:

void function_something( sometype v )

In the above code snippet, whether sometype is an atomic type such as int or a complex type such as a struct, the variable v holds a (shallow) copy of the argument and can be manipulated with the semantics of a local variable. This is “by value” argument passing.

Now let us write:

void function_something( sometype * v )

This snippet also makes perfectly clear what is going on. In the first assignment, v->some_field=0, it is unambiguous that what is pointed to by v is affected by the operation. In the second case, we just assign the local variable holding the pointer, which does not affect what v was originally pointing to.

In C++, pointers are often replaced advantageously by references, which are also basically pointers (that’s the way they’re implemented internally) but with a few subtle differences. The first is that a reference must be initialized when it is defined, so you can’t have a non-initialized reference (and, unlike a pointer, a reference cannot legally be null; the only way to get one is to dereference a null pointer, which is undefined behavior).

The previous snippets can be rewritten:

void function_something( sometype & v )
  v=0; // error! (or not, depends on sometype)

And there’s basically no way of changing the content of v itself, that is, the pointer holding the reference, without resorting to some rather unclean code (and undefined behavior?).

Python references are more finicky. First, the behavior depends on whether an object is mutable or immutable, and determining which is which seems to puzzle more than just me (more on mutable vs immutable on youtube). Let’s say, for the sake of argument, that the parameter is a list. In Python:

def function( this_list ):

The first line inside the function causes some_item to be appended to this_list, just as with C++. The parameter this_list is a reference to another object, so modifications applied through the reference affect the original object, as expected.

Now for the second one, things are different. It is not the content of this_list (or what this_list refers to) but the reference itself that is modified! Basically, this_list=new_list just makes the reference this_list, a local variable, refer to some other object, also possibly local. This effectively unbinds the reference from the original object and performs a purely function-local operation: from then on, nothing done through this_list will modify the original object.
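To make the distinction concrete, here is a minimal sketch of the two cases. The names append_item, rebind, original, and result are mine, chosen only for illustration; they are not from the original snippet. It shows that mutating through the parameter is visible to the caller, while assigning to the parameter is not.

```python
def append_item(this_list, some_item):
    # Mutates the object this_list refers to: the caller sees the change.
    this_list.append(some_item)

def rebind(this_list, new_list):
    # Rebinds the local name only: the caller's list is left untouched.
    this_list = new_list
    return this_list

original = [1, 2, 3]
append_item(original, 4)
print(original)   # [1, 2, 3, 4]

result = rebind(original, [9, 9])
print(original)   # [1, 2, 3, 4]
print(result)     # [9, 9]
```

The second call behaves like assigning to a local copy of a pointer in C: the caller's name still refers to the same list as before.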

This was somewhat puzzling, so I asked people on the #python channel on Freenode and, as usual, they suggested the most pythonesque way of doing a C++-style assignment. One user proposed the following anti-pattern:

def function( this_list ):

While I don’t really see why it’s an anti-pattern, it did the trick splendidly. The slice [:] allows you to assign to what is being referred to rather than just changing the reference.
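As a hedged illustration of that slice assignment (the names replace_contents, original, alias, and source are hypothetical, not taken from the IRC suggestion), the sketch below replaces the contents of the caller's list in place, so every name bound to that list sees the new elements.

```python
def replace_contents(this_list, new_list):
    # Slice assignment replaces the elements in place, so the list
    # object the caller passed in is the one being modified.
    this_list[:] = new_list

original = [1, 2, 3]
alias = original              # a second name for the same list object
source = [7, 8]
replace_contents(original, source)

print(original)               # [7, 8]
print(alias is original)      # True
print(original is source)     # False
```

Note that the elements are copied into the existing list; original does not become another name for source.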

* *

Learning a new language, even one strangely close to the paradigms you’re used to, will inevitably cause you some surprises, and the adaptation may or may not be easy. It’s not that I’m just learning Python (I’ve done some fairly complex things with it before); it is just that I am trying to write proper Python code, not just C++ in Python.

11 Responses to Python References vs C and C++

  1. Alejandro says:

    I think in Python it is better not to think in terms of “by value” or “by reference”. This explanation by David Goodger, where he says “Other languages have ‘variables’, Python has ‘names’”, helps me to understand this better:

    • Steven Pigeon says:

      The name analogy is good, but it doesn’t do justice to Python’s behavior. For example, you have no problem doing something like:

      def foo(x):
          def bar():
              return x+z
          return bar()

      if z is a “complex object”, but good luck if z is only an integer, say, z=3 instead of z=[] in the above. Apparently in Python 3 you have the nonlocal keyword to ease look-ups like that, but I use Python 2.6 and/or 2.7.

  2. Joe says:

    I am not sure where you ran into this from a practical perspective, but like Alejandro I find myself basically never having to think about this. While in C++ you often pass things by reference so the function can modify them, in Python you just return more objects instead. Something like:

    def swapMe(a,b): return b,a
    obja, objb = swapMe(obja, objb)

    • Steven Pigeon says:

      On the other hand, if you have a recursive function, it may or may not be convenient to just return the computed value everywhere. Not only is it not very intuitive (at least, for a non-pythonista like me), it also incurs a performance penalty to copy (even shallowly) a list or some object every time you exit the function.

      • Tom Plunket says:

        Returning a list does not copy anything, it returns a reference to which a name is (presumably) assigned at the call site. In this way, returning objects is superior to languages like C because the default behavior is (at least generally) the high performance behavior.

  3. Juan Juarez says:

    That’s not a ‘list comprehension’, it’s a slice.

  4. Andy Till says:

    It seems the same as in Java, the best explanation I found was that the pointers to objects (references) are copied but the objects are not.

  5. Craig says:

    “Basically, this_list=new_list just makes the reference this_list, a local variable, point to some other variable, also possibly local, effectively unbinding the reference and just performing a function-local operation: this_list will not modify the original object after.”

    Why would you expect to be able to use “this_list” to modify a variable after you tell that variable to point to a different object? I’m not trying to be crass, I genuinely don’t understand (I’m almost exclusively an interpreted language user).

    • Steven Pigeon says:

      Primarily because I expected this_list to behave as a C-style pointer. Assigning a local copy of a pointer doesn’t assign the original pointer.

      It’s pretty much the reverse for me: I almost always use compiled (and fairly low-level) languages, and I try to figure out the things the language does. In this case, it confirms that it’s a local copy of the reference, not the reference itself.

  6. betqil says:

    In the list example, Python works exactly the same way as C++ would. In C++, references are const pointers, so once initialized you can change the content but not what it refers to. So this_list = new List is not allowed. This leaves only the T* p case to compare with. When a function expects T* p, what is copied is the T* but not the T, and this is what we mean by passing by reference. So any attempt to modify p inside the function will not be visible outside the function, and p = new T[some_value] does not modify the original pointer that was provided to the function. If one wants to achieve this, then the function argument should be declared as T** p or T* &p.
