Sometimes you have to reversibly encode two (or more) values onto a single one. The typical example of a pairing function that encodes two non-negative integers onto a single non-negative integer (therefore a function $f:\mathbb{N}\times\mathbb{N}\to\mathbb{N}$) is the Cantor pairing function, instrumental in the demonstration that, for example, the rationals can be mapped onto the integers.
In a more pragmatic way, it may be necessary to encode two values onto one for data compression purposes, or to work around a protocol that does not allow many distinct values to be sent separately. The simplest way of pairing two integers is to interleave their bits, for example, with something like:
uint64_t pair(uint32_t x, uint32_t y)
 {
  uint64_t p = 0;
  int i = 0;
  while (x || y)
   {
    p |= ((uint64_t)(x & 1) << i);
    x >>= 1;
    p |= ((uint64_t)(y & 1) << (i + 1));
    y >>= 1;
    i += 2;
   }
  return p;
 }
with its inverse
void depair(uint64_t p, uint32_t & x, uint32_t & y)
 {
  x = 0;
  y = 0;
  int i = 0;
  while (p)
   {
    x |= ((uint32_t)(p & 1) << i);
    p >>= 1;
    y |= ((uint32_t)(p & 1) << i);
    p >>= 1;
    i++;
   }
 }
This implementation, of course, works only with machine-size integers. The idea of interleaving bits (or digits) for a pairing function isn't new; I remember reading about its invention in the 19th century, but I can't remember the name of the mathematician. Other kinds of pairing functions have been proposed, and one that is particularly elegant, and impractical, was proposed by Gödel as a way of encoding programs. Known as Gödel numbering, the scheme uses the fundamental theorem of arithmetic to encode strings onto a unique number. That is, for a string $s = s_1 s_2 \cdots s_n$, the Gödel number is given by:

$G(s) = 2^{s_1}\, 3^{s_2}\, 5^{s_3} \cdots p_n^{s_n} = \prod_{i=1}^{n} p_i^{s_i}$

where $p_i$ is the $i$th prime number. Since the product is uniquely factorizable (it's easy, for example, to divide the number by two until we get an odd quotient to find $s_1$), the code is reversible.
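To make the scheme concrete, here is a toy Python sketch of Gödel numbering for sequences of integers (the helper names are mine, not from the original; symbols must be at least 1, since a trailing exponent of zero would be lost):

```python
def primes():
    """Yield the primes 2, 3, 5, ... by trial division (fine for a toy)."""
    n = 2
    while True:
        if all(n % d for d in range(2, int(n**0.5) + 1)):
            yield n
        n += 1

def godel_encode(symbols):
    """Encode a sequence of positive integers as prod_i p_i ** s_i."""
    g = 1
    for s, p in zip(symbols, primes()):
        g *= p**s
    return g

def godel_decode(g):
    """Recover the sequence by stripping out each prime factor in turn."""
    out = []
    for p in primes():
        if g == 1:
            break
        e = 0
        while g % p == 0:
            g //= p
            e += 1
        out.append(e)
    return out
```

For instance, encoding the sequence (3, 1, 2) gives 2³·3¹·5² = 600, and decoding 600 recovers (3, 1, 2).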
Gödel solved the general case of encoding $n$-tuples, but let us return to pairing functions, that is, to 2-tuples, plain pairs. We would have:

$\langle x, y \rangle = 2^x\, 3^y$
At first, it seems that having a pairing function based on integer factorization isn't much of a good idea because factorization is hard, but this special case affords an especially efficient algorithm. In fact, we can factor the number $z = 2^x\, 3^y$ in essentially $O(\lg x + \lg y)$ steps.
Think about it before you continue reading.
*
* *
Since the encoded number is very simple, having only two distinct prime factors, 2 and 3, the complexity of finding $x$ and $y$ is greatly reduced. A naïve approach would successively divide by 3, letting you find $y$, then successively divide by 2, letting you find $x$, yielding an algorithm in essentially $O(x+y)$ steps (note that I am leaving the complexity of division out, to keep things simple, but you would have to include it were you to perform a complete analysis of the algorithm).
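The naïve decoder just described can be sketched as follows (my own illustration, not code from the post):

```python
def degodel_naive(z):
    """Recover (x, y) from z = 2**x * 3**y by repeated division."""
    y = 0
    while z % 3 == 0:   # strip one factor of 3 per step: O(y) divisions
        z //= 3
        y += 1
    x = 0
    while z % 2 == 0:   # then one factor of 2 per step: O(x) divisions
        z //= 2
        x += 1
    return (x, y)
```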
But, since we already know the prime factors, just not their multiplicities, we can observe that if $z = 2^x\, 3^y$, then for any integer $k \le y$, $3^k$ divides $z$. OK, that means that if the remainder of $z \bmod 3^k$ is zero, $3^k$ divides $z$ and therefore $y \ge k$, with $y - k$ possibly zero. This will allow us to find $y$ in $O(\lg y)$ steps.
Because, if $k > y$, then $3^k$ does not divide $z$, we can start with $k=1$, then double $k$ until $3^k \nmid z$, that is, until $k$ is too large for it to be a solution. This means that $y$ lies somewhere between $\frac{k}{2}$ and $k$, which gives us a bisection range.
Both the “galloping phase” of successively doubling $k$ and the bisection phase are performed in $O(\lg y)$ steps. Once $y$ is obtained, obtaining $x$ is easy: you can do just the same thing with powers of 2. Or use a fast $\lg$ routine for large integers, since what remains after dividing out $3^y$ is exactly $2^x$. Also performed in $O(\lg x)$.
Since it’s quite cumbersome to deal with big integers in C or C++ (and that’s somewhat of an understatement), let us propose a Python implementation:
def godel(x, y):
    return (2**x) * (3**y)

def degodel_log(z):
    # "galloping" phase: double hi_y until 3**hi_y no longer divides z
    lo_y, hi_y = 0, 1
    while z % 3**hi_y == 0:
        lo_y = hi_y
        hi_y *= 2

    # ok, we know it's somewhere lo_y <= y < hi_y: bisect
    while lo_y < hi_y:
        test_y = (hi_y + lo_y + 1) // 2
        if z % 3**test_y:
            hi_y = test_y - 1
        else:
            lo_y = test_y

    z //= 3**lo_y
    x = z.bit_length() - 1  # z is now an exact power of 2, so this is exact
    return (x, lo_y)
Hello, are you certain that the code you propose compiles and works as you present it?
For the Python code, you have to import math.
For the C code, you need <stdint.h>, which provides int32_t and the others (C99).
I did not understand the exhaustive loops.
E.g. using 32 bits instead of 64, but the logic is the same.
The exhaustive loops work by interleaving the bits of one number with the bits of the other. For example: if the first one is abcdef and the second one is ghijkl, then the pairing function computes agbhcidjekfl. Just like when you shuffle cards.
Your proposition could also be written:
where almost everything becomes compile-time constants. Or you could use something like a union:
which now does not contain any explicit shift and leaves the compiler to figure out the best code to generate for this ;)
I modified the original to be safer (it now works on the complete intended range). Thanks for making me question that ;)
Finally, I still do not understand the advantage of having the bits of two integers interleaved.
As for using unions, I am not in favor;
I would much rather opt for a form using more solid logic, of the following type.
I do not really appreciate optimization attempts that make the source code shorter and less readable.
Happy are those who adopt them and never have to debug their code again :)
void pair2(void *high, void *low, void *retval, short intsize)
{
  if (high == NULL ||
      low == NULL ||
      retval == NULL)
  {
    // zerror…
    return;
  }

  switch (intsize)
  {
    case sizeof(u_int8_t):
      // etc…
      break;

    case sizeof(u_int16_t):
      // etc…
      break;

    case sizeof(u_int32_t):
      // etc…
      break;

    case sizeof(u_int64_t):
      // etc…
      break;

    default:
      // zerror…
      return;
  }

  return;
}