Single-Pointer Doubly-Linked Lists

Old computer science books aren’t perceived as being of much use, since everything is so much better now. Of course, that’s not entirely true, especially when we are interested in the techniques used in the days where 32KB core memory was “a lot”. Leafing through one such book, Standish’s 1980 Data Structure Techniques, I found a method of maintaining doubly-linked lists using only one pointer. Let’s see how it works!

chain

The idea underlying the trick is that the exclusive or, denoted \oplus here, is a reversible operation. We have, in particular, that

(a \oplus b)\oplus b=a

and

(a \oplus b)\oplus a=b.

Using \oplus, we can conflapulate two pointers together and recover the other if we know one of the two pointers. So, you understood, it requires the knowledge of at least one pointer. To ensure that we always know at least one of the two pointers, we will use the known pointer to the previous node to decode the pointer to the next node. The list is structured, in fact, as:

xor-lists-after-standish

Starting at the start of the list, we have our first known pointer. We also that there is no node before the first, and so the previous pointer is nullptr, 0. The first node contains 0 \oplus a_2=a_2, so it’s easy to get the next pointer. We keep a_1 in mind; as we move to node a_2, we can find a_1 \oplus (a_1 \oplus a_3)=a_3, the next node.

*
* *

Here’s my rough proof of concept code:

#include <cstdint>
#include <iostream>

template <typename T>
class xor_node
 {
  public:

  xor_node<T> * ptr;

  T data;

  xor_node * xor_with(xor_node<T> * p)
   {
    ptr=reinterpret_cast<xor_node*>(
         reinterpret_cast<std::uintptr_t>(p)
         ^ reinterpret_cast<std::uintptr_t>(ptr));
   }
  
  xor_node * decode(xor_node<T> * prev) const
   {
    return reinterpret_cast<xor_node*>(
     reinterpret_cast<std::uintptr_t>(prev)
     ^ reinterpret_cast<std::uintptr_t>(ptr));
   }

  xor_node()
   : ptr(nullptr),data(0)
  {}

  xor_node(xor_node<T> * last, const T & thingie)
   : ptr(last), data(thingie)
  {}
};

template <typename T>
class xor_list
 {
  private:

  // there two are "true" pointers
  xor_node<T> * head;
  xor_node<T> * tail;

  public:

  xor_node<T> * first() const { return head; }
  xor_node<T> * last() const { return tail; }
  xor_node<T> * next(xor_node<T> * last,xor_node<T> * current) const
   {
    return current->decode(last);
   }

  void append(const T & thingy)
   {
    if (head)
     {
      xor_node<T> * p=new xor_node<T>(tail,thingy);

      tail->xor_with(p);
      tail=p;
     }
    else
     head=tail=new xor_node<T>(nullptr,thingy);
   }

  xor_list() : head(nullptr),tail(nullptr) {}
 };


int main()
 {
  xor_list<int> list;

  list.append(3);
  list.append(4);
  list.append(1);
  list.append(6);

  xor_node<int> * previous = nullptr;
  xor_node<int> * current = list.first();

  while (current)
   {
    std::cout << current->data << std::endl;
    xor_node<int> * t=list.next(previous,current);
    previous=current;
    current=t;
   }
  
  return 0;
 }

In the code above, the only “hard” part is the xor-ing of the pointers. You can’t just xor together two pointers, you have to convert them to integer before. However, you must be careful: it doesn’t suffice to convert them to int, as int may or may not be large enough to accommodate pointers. For example, on 64-bits Linux, int is still 32 bits, but pointers are 64-bits—it’s the LP64 model. To make sure the code is safe, you must use cstdint‘s std::uintptr_t with a reinterpret_cast (which doesn’t do copies, just reference-casting).

*
* *

The method might be useful where bidirectional enumeration is needed, and memory is scant. But it is also cumbersome. Iterators must know about current and previous pointers (or, if we’re doing a reverse traversal, the previous pointer in the other direction, so the “next” pointer). Insertion elsewhere than at the head or the tail of the list is also a bit more complex as you must know previous, current, and next, and recompute all the xors to add a new node. Traditional, explicit pointers lists would only ask for pointers overwrites.

Yet, we must admit that if cumbersome, the trick is clever. And potentially useful, even if we have plenty of memory. Since 64-bits pointers take 8 bytes each, we may end up devoting a large part of the memory to pointers rather than data.

(See also my post about “Short pointers“).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: