Phimuemue, in a recent post (at the time of writing, anyway) present his variation on sorting floating point values using radix sort. His implementation wasn’t dealing with the pesky sign bit so I offered a slight modification to his algorithm as a comment on his blog. But for some reason, he did not allow to post it.

So I’ll present my solution here.

So this is the verbatim text I sent as a comment:

I am a big fan of Radix Sort myself. Using on float always have bugged me because it’s rather hard to get an implementation that is completely safe. The C standard doesn’t enforce floats to be IEEE754 in the first place, although I know of no compiler nor modern CPU that aren’t IEEE754 (and those who aren’t probably use integer-like fixed point, so it should still work correctly). To make things even more troublesome, the standard doesn’t guaranty at all that sizeof(int)==sizeof(float). Int should be replaced by int32_t or uint32_t from the C99 standard header

<stdint.h>, after testing the actual size of floats (although I’m not sure how to do that cleanly).Dealing with negatives is not that hard, since the array would be mostly sorted (small to big positive numbers, then large to small negative numbers… something like 1 2 3 -1 -2 -3). If we have negative numbers, we just swap the top and bottom part of the array, and we’re done

^{*}.My code looks like:

// 32-bits version! void radix_sort_3(float array[], size_t count) { int zeroes=0; float temp_array[count]; // c99 only float * warray = temp_array; for (uint32_t radix=1;radix;radix<<=1) { uint32_t * iarray = (uint32_t *)array; int count0=0; int count1=0; zeroes=0; for (int j=0; j<count; ++j) zeroes += !(iarray[j]&radix); count1=zeroes; for (int j=0; j < count; ++j) if (iarray[j]&radix) { warray[count1]=array[j]; ++count1; } else { warray[count0]=array[j]; ++count0; } // we won't copy that // each time! swap((float**)&warray,&array); } // here 'array' is restored to // its original value. // are there negatives? // if (zeroes<count) { // oh noes! we must swap parts // before leaving! memcpy( warray+(count-zeroes), array, zeroes*sizeof(float)); for (int d=0,j=count-1;j>=zeroes;j--,d++) warray[d]=array[j]; memcpy( array, warray, count * sizeof(float)); } }I changed the copy in each iteration by a swap-buffers-by-pointers operation. My version is a bit more type-safe, but not completely. In particular, it’s not endian-safe. If integers and float have different endians on the target machine (I don’t think there’s such a machine, but then again…) the code won’t work.

^{*} that’s not an entirely accurate description of what the algorithm does. In fact, it does swap top and bottom parts of the array, but does a *reverse* copy in the lower part, ensuring we list numbers from smallest to largest in all cases.

My solution isn’t that bad, actually. At least it prevents the toggling of the sign bit (and can been seen as a variation on this solution, which I discovered a bit later).

*

* *

The problem with *all* solutions is that they suppose IEEE 754 floating points, which neither of the C nor C++ standards guaranty. You may or mayn’t have IEEE 754 floats. So, in theory, this poses a portability problem. In practice, I can’t think of a non-legacy system where the floating point numbers for C and C++ weren’t in IEEE 754.

It still however poses the problem of endians. Endianness determines how larger data types are stored in memory. For example, in a *small endian* system, the least significant bytes of a multi-byte integer are found at the lower addresses, while on a *big endian* system, the most significant byte is stored first, at the lower address. This means that the bits you’re looking for may not exactly be were you think it is. Endianness is indirectly taken care of if `int` and `float` have the same storage order—both big or both small endian—but that’s not always the case. For example, on a MIPS R10000 CPU (which was in SGI’s Oxygen workstations), floats were stored in “normal order” and integers were stored in big endian order …if I remember correctly.

*

* *

Of course, I am kind of familiar with sorting in general, and I’ve done a number of things with radix sort before. I’ve done a paper on this in Dr. Dobb’s Journal quite a while ago and revisited the results and techniques recently here and here

*

* *

Radish clipart curtesy of www.DailyClipart.net.

GNU sort does a radix sort on floats in numeric mode

http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=lib/strnumcmp-in.h;hb=HEAD

Hi Steven,

nice article. Excuse that I didn’t approve your comment. I get lots of spam and I apparently removed your comment by mistake.

phimuemue

Hi,

First of all I would like to thank you for this article – it’s very usefull for me:)

However, I would like to ask you about toggling the sign bit in float values.

I’m of course aware that when you toggle the bit, you change the value. Assume, however, that values are irrelevant. So my question is: Which approach gives more performance boost: your approach with swap-buffers-by-pointers or the toggle-bit-approach?

Thanks for answer and once again thanks for great article!:)

Wouldn’t you have to remember which values were orginally negative? (you’d have to.) How do you propose to reorder while keeping the sign bit somewhere else? Would it not result in -5 and 5 being sorted side by side in the final array?

Surely it would be even more efficient to do a few ALU ops on the floating point number to invert the negatives as well as the sign bit to maintain continuous values without rearrangement?

For example:

uintValue ^= (1+~(uintValue >> 31) | 0x80000000);

and to return:

uintValue ^= (((uintValue >> 31) – 1) | 0x80000000);

ie you always flip the sign bit, and if it’s negative you invert everything by bit-extending the sign bit to the entire 32-bits.

During the first stage of your sort you invert and on the output you flip back. For large sorts I would expect this to be faster than performing another pass through memory, but I could be wrong.

[…] čísla mají bitovou reprezentaci podle specifikace IEEE754. Nejjednodušší možností je data seřadit jako integery a pak opravit pořadí záporných hodnot. O něco rafinovanější je floaty na vstupu transformovat, tak aby by je bylo možné řadit […]

[…] implementace řadící osm bitů v jedné iteraci potřebuje 5 iterací pro 32bit klíče (první spočítá frekvence […]