A positional number system needs a base that is either greater than one, or smaller than minus one—yes, we can have a negative base for a number system. The system, however, seems to break down if the base we chose is base 1.

If the base is 1, then there are no permissible digits since the digits , in a base system, must be . But we can still represent numbers using just 1s. That's the unary numeral system, and numbers are just represented as repeated 1s. 15? Fifteen ones: 111111111111111. Operations? Not very complicated, just… laborious.

Indeed:

Addition of two unary numbers is straightforward: you just concatenate the two series of ones.

Subtraction of two unary numbers. Suppose and are two numbers. The operation (supposing ) is just that for each 1 in , you remove a 1 in .

Multiplication of two unary numbers. Again, let and be two numbers. We proceed by iterated addition. Let be zero (no ones, the empty number?). While , concatenate to , remove one 1 from .

Division of two unary numbers. Let and be zero. While , subtract from , add 1 to . will be the quotient, and the remainder.

None of these operations seems particularly complicated, but they are all very tedious. Adding or subtracting 1 from an unary number is merely adding or removing one 1 at the end, but adding two arbitrary numbers means copying a whole lot of ones. Subtracting is also not as straightforward as we might hope. On paper, we could align both numbers and easily see if one is larger than the other and by how much. In the computer, however, we must scan both series of ones (assuming we can’t know lengths otherwise) to figure out exactly how much is left.

* * *

So what’s the use of unary encoding? Well, under certain circumstances, you can use it to encode integers using a variable length code. Well, almost. First, we must notice that in reality, “unary” encoding still uses two symbols. True, there’re the ones, but there’s also the space, the delimiter, that signals the end of the string. On paper, this extra “character” is blank paper, but it does contain information. In the computers’ memory, we’ll use 0 as a delimiter. The number 7 in “unary” becomes 11111110.

That’s actually an efficient code when numbers are distributed exponentially, that is, .

This entry was posted on Tuesday, February 13th, 2018 at 20:39 pm and is filed under data compression, Mathematics. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

Hi. Great and simple post.
Can you please explain the last line:
That’s actually an efficient code when numbers are distributed exponentially, that is, P(x=n)\propto 2^{-n}.

If , (with ) then you have 50-50% chance that n=1, and the code is 0, if not, you have one of two chances that n=2, and its code is 10, … n=3 has code 110, n=4 has code 1110, etc.

And since Shannon’s theorem states that a code is optimal whenever

is minimized, a code that uses n bits to code number n is optimal if .

Hi. Great and simple post.

Can you please explain the last line:

That’s actually an efficient code when numbers are distributed exponentially, that is, P(x=n)\propto 2^{-n}.

If , (with ) then you have 50-50% chance that n=1, and the code is 0, if not, you have one of two chances that n=2, and its code is 10, … n=3 has code 110, n=4 has code 1110, etc.

And since Shannon’s theorem states that a code is optimal whenever

is minimized, a code that uses n bits to code number n is optimal if .