Sometimes, you have a small bit of data, may something like a GUID (for which there are many possible solutions), that you may have to store in a plain-text file, nothing crucial, not sensitive, but that you don’t really want your users to poke with, even if they really mean to. In such cases, you could use encryption, but it may be that *mild obfuscation* is quite sufficient and dissuasive.

So, if you don’t really want strong encryption, what can you do to provide a machine-efficient encryptionnette?

OK, let me repeat this one more time: the premise is that we do not need top-notch encryption, just a way to keep the user from messing with a value stored in a plain-text file. Of course, if you’re doing open-source, he will be able to do so more easily, but the point is that you want to make sure that it’s kind of not really worth the trouble to mess with that value.

So let’s say that your GUID is built from the machine’s first MAC address and a strong check-sum, something like a CRC, yielding 64 bits strings: 16 for the check-sum, and 48 for the MAC Address.

So what are the easily invertible operations one can apply to a 64-bits number to yield an (apparently) uncorrelated number?

We can try to multiply by a constant such that

.

In other words, find such it is its own inverse modulo . Typically, for machine-efficiency reasons, but can be any value larger than 1. For some , only the only solutions are , for others, you’ll have a couple more solutions. If , then is a solution. For , `0x7fffffffffffffff` and `0x8000000000000001` are solutions. Interestingly enough, these multipliers seem to randomize the input quite nicely. For example, using and , we get:

3d1b58ba |
c2e4a746 |

507ed7ab |
2f812855 |

2eb141f2 |
d14ebe0e |

41b71efb |
3e48e105 |

79e2a9e3 |
061d561d |

7545e146 |
8aba1eba |

515f007c |
aea0ff84 |

5bd062c2 |
a42f9d3e |

12200854 |
eddff7ac |

4db127f8 |
b24ed808 |

…which looks randomizing enough. The inverse is to multiply by again, which makes it a symmetric operation. The integer multiplication is not that expensive on a modern processor, but it can be quite involved for a micro-controller class CPU (even if it’s capable of 32-bits arithmetic).

*

* *

We can xor our data with a fixed random mask. Pick any 64-bits number using a pseudo-random number generator), not necessarily a strong one. Since the mask is fixed, there’s no point of having a very strong random number; it just has to be random enough so that when you xor it with your data, it looks more random than it used to. Of course, this is clearly not impervious to a known plain-text attack, it suffice to send a zeroed vector to retrieve the key.

*

* *

We can use a bit blender. A bit blender, at least how I defined it in a previous post, is an operation that move bits around in a lossless fashion—that is, the operation is *reversible*. That is, it applies an arbitrary permutation of the input bits: it can reverse the vector, apply a rotation, move blocks around, etc. For the application we are considering here, a blender that disperses the bits randomly (but reversibly) is preferable. One simple operation that does that rather well is the *perfect shuffle*—an operation known to card-game players as the Faro Shuffle.

The following diagram shows the shallowest network that produces the perfect shuffle for 8 elements:

A simple Python implementation of the perfect shuffle would look like:

def perfect_shuffle(a): """perfectly shuffles the supplied list, throws AssertionError if list has odd length.""" l=len(a) if l % 2: raise AssertionError else: h=l/2; # half-length # variant proposed by Kirk McDonald # (from the #python channel on Freenode) return [item for pair in zip(a[:h],a[h:]) for item in pair] def perfect_unshuffle(a): """perfectly unshuffles the supplied list, if previously perfectly shuffled.""" return a[::2]+a[1::2]

It will take a list `[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]` and yield `[0, 5, 1, 6, 2, 7, 3, 8, 4, 9]`. Note that the shuffled list contains all of the original list’s items. The transformation is also reversible, and we can recover `[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]` from `[0, 5, 1, 6, 2, 7, 3, 8, 4, 9]`.

A C/C++ implementation is a bit more involved. Well, a whole lot.

uint32_t perfect_shuffle(uint32_t x) { x=(x & UINT32_C(0xff0000ff)) | ((x & UINT32_C(0x00ff0000)) >> 8) | ((x & UINT32_C(0x0000ff00)) << 8); x=(x & UINT32_C(0xf00ff00f)) | ((x & UINT32_C(0x0f000f00)) >> 4) | ((x & UINT32_C(0x00f000f0)) << 4); x=(x & UINT32_C(0xc3c3c3c3)) | ((x & UINT32_C(0x30303030)) >> 2) | ((x & UINT32_C(0x0c0c0c0c)) << 2); x=(x & UINT32_C(0x99999999)) | ((x & UINT32_C(0x44444444)) >> 1) | ((x & UINT32_C(0x22222222)) << 1); return x; } uint32_t perfect_unshuffle(uint32_t x) { x=(x & UINT32_C(0x99999999)) | ((x & UINT32_C(0x44444444)) >> 1) | ((x & UINT32_C(0x22222222)) << 1); x=(x & UINT32_C(0xc3c3c3c3)) | ((x & UINT32_C(0x30303030)) >> 2) | ((x & UINT32_C(0x0c0c0c0c)) << 2); x=(x & UINT32_C(0xf00ff00f)) | ((x & UINT32_C(0x0f000f00)) >> 4) | ((x & UINT32_C(0x00f000f0)) << 4); x=(x & UINT32_C(0xff0000ff)) | ((x & UINT32_C(0x00ff0000)) >> 8) | ((x & UINT32_C(0x0000ff00)) << 8 ); return x; } uint64_t perfect_shuffle(uint64_t x) { uint16_t *y=(uint16_t*)&x; std::swap(y[1],y[2]); return ((uint64_t)perfect_shuffle((uint32_t)(x >> 32)) << 32) | perfect_shuffle((uint32_t)x); } uint64_t perfect_unshuffle(uint64_t x) { x=((uint64_t)perfect_unshuffle((uint32_t)(x >> 32)) << 32) | perfect_unshuffle((uint32_t)x); uint16_t *y=(uint16_t*)&x; std::swap(y[1],y[2]); return x; }

How this works is explained here in great detail.

The point is that the perfect shuffle, like when you’re shuffling cards, disperses bits around quite a bit. But reversibly so.

*

* *

So what if we were to combine all these operations? that would be the best order? Since they are each reversible, any order could be applied: xor-mul-shuffle, or mul-shuffle-xor, or mul-xor-shuffle, or…

The mul-xor-shuffle test code would look something like:

// reversibility test std::cout << std::hex; const uint64_t mult=UINT64_C(0x7fffffffffffffff); uint64_t mask=((uint64_t)std::rand() << 32) | std::rand(); std::cout << "mask=" << mask << std::endl; for (int i=0; i<10000000; i++) { uint64_t u=((uint64_t)std::rand() << 32) | std::rand(); // sure! uint64_t obfuscated_u=perfect_shuffle((u*mult)^mask); uint64_t unobfuscated_u=(perfect_unshuffle(obfuscated_u)^mask)*mult; if (unobfuscated_u != u) { std::cout << "oh noes!!!" << u << " " << unobfuscated_u << std::endl; return 1; } }

With a `mask` of `1888600f3e744fe9` (generated randomly), we get:

obfuscated | |

3579ad956a6116a |
e9350c5144231dd7 |

59400ebf61afd9f7 |
4ea80e3a964374aa |

606c6ffe5916d626 |
c16a43dbbe1405ad |

4094d7a77661edc0 |
cd3ffc4e31d1666b |

6440029e149e048d |
511b0b3bc7b269ec |

72fc9db1258a1eb7 |
5632808b445c6402 |

7d1618fa4b6b5ec9 |
42cc7c02d47e51dc |

171a50cd17b3e4ca |
fb142de2b1ba5bf7 |

00a8dfc459c13348 |
e96ab2ee60051b21 |

27686abb39843dd0 |
f54002ffea7371cb |

OK, listing a list of numbers that kind of looks like random is a proof by volume, but you can experiment by yourselves.

*

* *

I am sure you have thought, reading this entry, of many more ways of mildly obfuscating a n-bit word, and that’s fine. The main point is the operation must be *randomizing* but *reversible*. If it’s not reversible, it’s merely a hash, and that’s not what we wanted.

Can you think of other reversible randomizing operations?

[…] Mild Obfuscation, I […]

Hello Steven,

your first table (x -> a*x mod m) seems to be using m=2^32, contrary to your statement that m=2^32-1.

Great post!

You’re absolutely right! (I was thinking

& 0xffffffff). I will correct this.[…] discussed moving bits around before. The first function (that I called “blender”) moves bits around following this […]