Minimizing boolean expressions is of great pragmatic importance. In hardware, it is used to reduce the number of transistors in microprocessors. In software, it is used to speed up computations. If you have a complex expression you want to minimize and look up a textbook on discrete mathematics, you will usually find a list of boolean laws and identities that you are supposed to use to minimize functions.

Textbooks usually list the following boolean laws and identities:

- They list and explain basic operators: and , or , exclusive or , negation , implication , true , false , etc.
- Double negation:
- Identity laws: ,
- Idempotent laws: ,
- Dominance laws: ,
- Commutative laws: ,
- Associative laws: ,
- Distributive laws: ,
- De Morgan’s laws: , .
- Sub-expressions elimination. You can rewrite as , , and finally .
- And other identities you will discover.

Using these laws, you are supposed to massage your expression until you discover a shorter one that computes the same truth values as the original expression. Or you can use a truth table and see what expressions are equivalent to some of the sub-expressions and how you can combine them to obtain an equivalent but shorter boolean expression. All you need is a pen, a piece of paper, coffee, and a lot of patience. While I make this sound like a tedious process, you will rapidly get the knack of this and you will find that minimizing expressions of a few variable quite easy.

Let us take an example. You may happen to write something like (I do, once in a while):

if ( current->next==0 || (current->next!=0 && SomeThing() ) { ... } else ...

The conditional statement is equivalent to , were is `current->next!=0`, and is `SomeThing()`. Using the truth table approach (a favorite of textbooks), you write:

0 | 0 | 1 | 0 | 1 | 1 |

0 | 1 | 1 | 0 | 1 | 1 |

1 | 0 | 0 | 0 | 0 | 0 |

1 | 1 | 0 | 1 | 1 | 1 |

The last column you eventually find by trying numerous combinations (ok, in this case it’s not all that difficult, but you get the point) to find that . So you rewrite the above code as:

if ( current->next==0 || SomeThing() ) { ... } else ...

The textbook approach to minimization, however, doesn’t tell the whole story: it considers that all operators, , , , , etc. have an equal cost; namely zero. However, in real life, evaluating a boolean expression does incur a time cost. First, each variable involved in the expression must be assigned a truth value, which may be a constant but may also be the result of a computationally expensive function. Second, the expression itself must be evaluated, and the cost of the evaluation is not null. It is proportional to the number of operators involved.

Using boolean simplification and common sub-expression expression will help a lot to reduce the computational cost of the evaluation by not evaluating costly sub-expression more than once, but we can go further. Let us note that it is not always necessary to evaluate the *entire* expression to ascertain its truth value. For example, the expression is true if is true, regardless of the value of , so it is safe to abort the evaluation of the expression whenever is true. If is false, then we proceed to the evaluation of . Similarly, the expression is false whenever is false, regardless of the value of . The sub-expression is evaluated if, and only if, is true.

Some (most?) programming languages makes use of this observation to speedup evaluation of the conditional expressions. In Simula, for example, the short-cut evaluation is explicitly controlled by the programmer via the logical operators `and then` and `or else`. In Turbo Pascal, compiler switches, `{$b+}` and `{$b-}`, would instruct the compiler to generate code for complete or truncated expression evaluation. C and C++ use short-cut evaluation in all cases.

All C/C++ programmers use short-cut boolean expression evaluation to conditionally enable parts of expressions. Indeed, it is not uncommon to see code that looks like

if (p && p->count) { // do stuff with p because // p is not null } else { // do something else }

The first part of the expression prevents the second part `p->count` from being evaluated if `p` is null. However, I think not all programmers explicitly use this feature to speedup tests by carefully resequencing/reorganizing the conditional expressions. Consider the following piece of code:

if ( really_slow_test(with,plenty,of,parameters) && slower_test(with,some,parameters) && fast_test // with no parameters ) { ...then code... }

This code first evaluates an expensive function then, on success, proceeds to evaluate the remainder of the expression. But even if the first test fails and the evaluation is short-cut, there’s a significant performance penalty because the fat `really_slow_test(...)` is evaluated. While retaining program correctness, one can rearrange the expression as follows:

if ( fast_test && slower_test(with,some,parameters) && (really_slow_test(with,plenty,of,parameters)) { ...then code... }

Now, the code is much faster because `fast_test` costs very little compared to the other two parts of the expression, and we proceed to evaluate `really_slow_test(...)` only if the two previous tests succeed. Ideally, the expressions should lend themselves to such reordering. In practice, there might be dependencies in the tests that force a natural ordering, regardless of which test is expensive or not. Think of our previous example with `p` and `p->count`; clearly, inverting the expression is not going to work. In many other cases, however, one can massage the expression to simplify it and to move the easiest, fastest, most likely to fail- (`&&`) or succeed-fast (`||`) tests to the front, allowing the expression’s value to be ascertained as fast as possible while being still correct.

The use of fail- or succeed-fast testing is, I think, a good, simple pratice that does not endanger code legibility and that does not lean too far on the side of premature optimization (and all its evils), yet yields interesting performance gains whenever the conditional expressions aren’t completely trivial.

Another thing to consider is branching cost which easily runs into 10s of cycles.

On the assembly level if ( a && b && c ) { block } translates to

If your tests are simple the penalties are proportionately larger. e.g.

You’re right. Branchings ARE expensive. However, if cascades aren’t a frequent programming style, and they might not help legibility all that much.

The second trick you present is valid when all variables are already assigned a truth value, or have unit cost of initialisation. && translates to &, || to |, and using De Morgan’s laws as you did allows to transform ((a==0)&&(b==0))==1 as ((a!=0)||(b!=0))==0 which is, indeed, (a|b)==0. I was going to talk about this in Part II, thanks for the spoilers ;)

If you do not have exactly 0 or 1 as your variables’ values, you can still demote them to zero anyway. Suppose you have ((a==3)&&(b==c))==1, you can rewrite this as (((a-3)==0)&&(((b-c)==0))==1, which, applying De Morgan’s law again, you transform to ((a-3)|(b-c)) == 0. All low-cost arithmetic, no extra jump due to short-cut evaluation.

can you please provide the alternate boolean minimization techniques other than k -map tabular method

There are other techniques, but they are ultimately all very complex (in the computational complexity sense). It is a NP-complete problem, see

The Complexity of Boolean Functionsby Ingo Wegener.A few video of courses on the subject:

There are others linked into the “suggestions.”

Let me describe a “brute force” technique to boolean optimization I came up. The concept is to calculate term evaluation speed and probability during runtime and adjust the order of evaluation dynamically. I’ll apply it to your example:

Here is the rewrite:

The cool stuff happens inside OptimizeExpression. The first iteration of the while loop constructs all the terms and captures the initial term timing. The start/end times of each expression are captured and sorted using some appropriate ranking ie. time/probability with a running average. The next iteration of the while loop will adjust the order of evaluation using the case statement assigned to each term. This executes the fastest terms with the highest probability of being false for AND expressions and true for OR expression. Considerations for using this are: 1) need to evaluate ALL terms in the expression to gather the statistics of unknown/changing execution times and probabilities 2) this expression has to be something you do a lot, ie. looping through this expression hundreds of times with different data. 3) the “while switch” construct should replace any “if” without scoping issues. 4) the term evaluation needs to be expensive to compensate for the overhead of this mechanism (although it has little impact I can see)

I would think it would be most useful when you gather the run-time behavior, then use the collected data to hard code the final result: like a profile-guided optimization. With gcc you can instrument the code to gather this data and the compiler is able to use it to generate better code.

Did you actually looked at the code generated / measured the overhead ?

Hello, Did you ever write the second part of this post (as per your comments you were thinking of doing so in which you would cover reduction of logical operators to arithmetic operators)

btw, I read your blog regularly, it is quite informative. Thanks!

I was to write a second part? …I don’t think I ever wrote it. I’ll put that on my to-do list.