Fast Path Finding (part II)

Last week we had a first look at a simple path finding algorithm with application to games. This week, let us have a look a the relative performance of the implementations considered: C# with a naïve implementation that scans the whole map multiple times; a C# implementation that uses lists to maintain and visit only the border of the solved region (affording massive speed-ups, one hopes); and C++ versions of these two variants, plus a C++ variant that uses set (as bitmaps) to maintain the border list.

monster-small

In any quantitative study, the first thing to do is to collect data. The best way is to automate the collection, by, for example, adding a function that puts the player in all legal positions on the map, compute all the shortest paths, and measures how long it takes. One may also consider disabling the on-demand power policy as the CPU may jump (progressively?) in high gear as the data collection progresses, introducing bias.

(The policy setting script is simple:

#!/usr/bin/env bash

# enumerate found CPUs
cpus=$( grep processor /proc/cpuinfo | cut -d: -f 2 )

# set governor for each CPU
#
for cpu in ${cpus[@]}
do
 cpufreq-selector -c $cpu -g $1
done

and you invoke it with either performance (full CPU speed) or ondemand to put it back in variable speed.)

Let us consider the same map as before:

carte-05

The program is compiled with all optimizations enabled; the CPU is set to performance, the program is launched, the algorithm is selected, the map loaded, and then the data collection begins. As explained above, the player is placed in each legal position on the map and the time taken to compute all shortest paths is recorded, with microsecond accuracy.

The C# versions both take more than 1ms to solve, with the naïve version much slower than the minimal list version:

c#-compared

The C# naïve version takes about 5.4ms on average. It seems not that bad until you realize that the test does not run on a ATMega chip, but an i7, and that the map is merely 50×50 tiles. Even the “fast” version at 1.4ms did not impress me much. That’s what prompted me to re-implement everything in C++ and see what happened.

There are three C++ versions. The first two re-implement the C# versions. The third one replaces the next-to-visit tile list by a bitmap and avoids memory allocation to a greater extend (something you can’t control much in C# since it’s always new this and new that).

As expected, the C++ versions are much faster:

all-compared

With the C++ naïve algorithm being about as fast as the C# next-to-visit tile list version. The next-to-visit tile list C++ version is much faster, averaging 262µs. That’s 0.26ms. The C++ implementation of the next-to-visit tile list that uses a bitmap averages at 234µs, somewhat faster, but showing more variance.

c++compared

*
* *

No doubt I could optimize the algorithm a lot more by having a less object-oriented approach and by using flat arrays entirely (some arrays are still 2D), and even maybe doing some of the stuff directly in assembly language. However, at 234µs (from 5.4ms to 234µs) is a 23× speed-up!

I think I’ll stop here.

*
* *

So where does the speed-up come from? For one part, it’s simply using a language that has better performance than the original language. However, I do not know to which extend the results are influenced by the fact that I used dmcs, the Linux C# compiler based on Mono; but it is clear that the C++ versions are much faster, despite being a direct port: no fancy optimizations in the C++ versions, just a minimal translation from C# to C++. Only the bitmap-based version has a lot more optimizations (such as avoid copying arrays, using bitmaps instead of sets, etc.). In this case, we have a performance-critical application (finding all shortest path in a video game setting) that we brought to good performance 1) by choosing a better programming language (as far as performance is concerned; C# is, in many respects, much easier to use and makes clearer code than C++) and 2) by having a better algorithm (one that maintains only the tiles to be visited at a given iteration).

Choosing a better algorithm also gives quite the speed-up in the C# version, and we can safely say that it’s always better to have a better algorithm (tautologically so!).

One Response to Fast Path Finding (part II)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: