The sorting algorithms we were taught in class were typically simplified versions of the algorithms that assumed that the data remained in memory and in contiguous memory locations, also known as an array. What we often have, however, are linked lists. Generalizing algorithms like the QuickSort to lists is not very hard in fact if we use a modification of the basic algorithm.
For lists with numerical keys (or values), there might be a simpler algorithm that scans the list in only one direction, so that simply linked lists are a perfectly cromulent data structure, that is, the radix sort.
The basic idea of the radix sort is extremely simple and best understood using the playing cards analogy. Say you have a normal deck of 52 cards that you have to sort in the French suite order—diamonds (♦), spades (♠), hearts (♥) and clubs (♣), from ace (1) to King (13). A naïve approach would be so hold the unsorted cards in one hand and insert them one by one in the correct place with the cards being sorted, splayed out on the table.
A counter-intuitive but much more efficient way is to build thirteen stacks of cards, one for the aces, the deuces, up to the kings, and distribute the cards one by one from the unsorted hand onto the stacks. When this first pass is completed, the cards are still unsorted, but all the aces are together, so are the deuces, up to the kings. The second pass is to take the thirteen stacks so that the deuces are under the aces, the threes under the deuces, etc, with the kings completely under. You then form 4 stacks, one for each suit (♦, ♠, ♥, ♣). Each time you draw a card, you send it to the appropriate stack. When you’re done, you just stack the four stacks together, and voilà! You have a sorted deck of cards.
Radix sort uses the same kind of techniques to sort a list of numerical values of known maximum bit width. However, instead of using suits and numerical card value, you would repeat a similar process with parts of the numerical key bits long,
bits at a time.
Using bits, you necessarily need
stacks. You will repeat the process
times to form a completely sorted list.
For each key in the list, at iteration , you take the
th bunch of
bits from the key and send it to the corresponding stack in the
stacks, the stack whose number is given by those
bits considered as an integer. At iteration
, for a given key, the stack would be given something like (key >> (k*i)) & mask_k. the mask ensures that only the
least significant bits are kept; mask_k=(1<<k)-1,
1 bits.
At the end of an iteration, you get stacks which you must re-stack into only one list. A naïve approach, say using STL containers, would ask for a lot of copies, allocation and de-allocation of list nodes. However, if you have an implementation where everything is accessible, you may notice that 1) even if you merge the stacks, the orders of items do not change, and 2) since it’s a linked list, why not link the top of the previous stack to the bottom of the next one? These two observations let us “sew” the stacks together as a single list, and the cost of the operation is not dependent on the number of items, but only in the number of stacks!
The two next figures illustrates the stacks and the stack “re-sewn”
The algorithmic complexity of the radix sort depends on the numbers of items, and on the parameters
and
. The number of iterations,
does not depend on
at all, and each iteration costs
operations since for each item, it suffice to compute the stack number and send it to that stack, and both operations can be computed in
. The total complexity is therefore of
, which is linear in
. For
bits key values, Radix sort is therefore faster than QuickSort which is
.
Another interesting property of radix sort is that the lists are always scanned in the same direction, allowing simply linked lists (one with only a pointer to the next element). It may also be quite amenable to implementations where the data is laid out on many disks, as it was common in the olden days… and maybe quite necessary with monster data sets used in data mining applications.
*
* *
A bare-bone C99 example implementation would look something like that:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
// C99 Program (not ANSI C!)
//
// gcc -std=c99 -Wall -Wextra -O3
const int nb_numbers=1000; // any large number
////////////////////////////////////////
//
// a list node
//
struct list_node {
int number;
struct list_node * next;
};
////////////////////////////////////////
//
// structure that holds
// the head and tail of
// a list
//
struct list {
struct list_node * head;
struct list_node * tail;
};
////////////////////////////////////////
//
// quite minimalistic append
//
void append( struct list * this_list, int number )
{
struct list_node * new_node = malloc( sizeof (struct list_node) );
new_node->number=number;
new_node->next=NULL;
if (this_list->tail)
{
this_list->tail->next = new_node;
this_list->tail = new_node;
}
else
this_list->head=this_list->tail=new_node;
}
////////////////////////////////////////
//
// appends a node to a list
// and unlinks it
void append_and_unlink_node( struct list * this_list,
struct list_node * this_node )
{
if (this_list->tail)
{
this_list->tail->next=this_node;
this_list->tail=this_node;
}
else
this_list->head=this_list->tail=this_node;
this_node->next=NULL; // unlinks
}
////////////////////////////////////////
//
// shows the content of a list
//
void show_list(struct list * this_list )
{
struct list_node * current = this_list->head;
int sorted=1;
int last=0;
while (current)
{
printf(" %d",current->number);
if (last > current->number) sorted=0;
last=current->number;
current=current->next;
}
printf("\n");
if (sorted)
printf("sorted.\n");
}
////////////////////////////////////////
//
// the radix sort on simply-linked
// list
//
//
void sort_list(struct list * this_list,
int base // in bits
)
{
int i,s;
const int nb_buckets=1<<base;
const unsigned mask=nb_buckets-1;
const int iterations=(sizeof(int)*8+base-1)/base; // with rounding!
struct list buckets[nb_buckets];
// exits if the list is empty
//
if (this_list->head==NULL)
return;
// do the iterations
//
for (i=0,s=0;i<iterations; i++,s+=base)
{
int b,c;
struct list_node * current = this_list->head;
// clears the buckets
for (b=0;b<nb_buckets;b++)
buckets[b].head=buckets[b].tail=NULL;
// scans the original list's items
// one by one and transfer them to
// buckets
//
while (current)
{
struct list_node * next = current->next;
int num = current->number;
int bucket = (num >> s) & mask;
append_and_unlink_node(&buckets[bucket],current);
// necessary because here
// the node is unlinked from
// the main list
//
current=next;
}
// sews the buckets back together
// while checking for empty buckets
//
for (c=0,b=0;b<nb_buckets;b++)
if (buckets[b].head)
{
if (c)
// this is not the first (non empty)
// bucket
{
this_list->tail->next=buckets[b].head;
this_list->tail=buckets[b].tail;
}
else
// first non-empty bucket
//
{
this_list->head=buckets[b].head;
this_list->tail=buckets[b].tail;
c=1;
}
}
}
// buckets pointers can be just
// dropped (no deallocation needed)
//
}
int main()
{
int i;
struct list this_list={NULL,NULL};
// fills the list with
// random numbers
//
srand(time(NULL));
for (i=0;i<nb_numbers;i++)
append( &this_list, rand());
show_list(&this_list);
sort_list(&this_list,4);
show_list(&this_list);
return 0;
}
This code is released under the BSD license.
*
* *
RadixSort beats QuickSort, complexity-wise, whenever .
is usually the number of bits of the machine-sized integer, something like 32 or 64, but ultimately depends on the key. Maybe the true value of
is something like 24, or 19. Knowing
, one must solve for
in order to have a sort that is faster than Quicksort. We know that
grows very slowly in
so we must make sure that
is chosen properly. Using basic algebra, we find that
, so
can be chosen adaptively depending on both
and
.
If and
, the radix sort will behave more or less as a QuickSort, complexity-wise, yielding
complexity. However, it would not suffer QuickSort’s
worst case complexity: it remains exactly always
.

