Computing the Neutral Zone

Have you ever add to decide, you and your colleagues, where to go for lunch? Each time, it ends up being a committee, of course. It gets even worse when not only you have many colleagues, but also two offices, or two groups, at different locations. Since we work in a rather large city, we want to walk to the chosen restaurant, rather than drive, but in a way that is fair to either group.

So to settle the argument about where are the restaurants midway of both locations, we need a map, and some math.

The first step is to get a map of the disputed locations. Google maps is an excellent source of maps, as it is both easily accessible and mostly accurate. Grabbing a screen-shot from the browser:

and you mark both starting locations (black, round, unnamed dots on the map).

The second thing is to decide how you measure distance. Will it be a $L_1$ measure or a $L_2$ measure? The $L_1$ measure is also know as the taxi-cab distance, measures distance as if moving along a discrete, orthogonal grid. Between two points $a$ and $b$ on a 2D-grid, the $L_1$ distance is given by:

$\|a-b\|_1=|a_x-b_x|+|a_y-b_y|$

That is, nothing fancier than the sum of absolute differences in each dimensions. Since the street grid is running s-w to n-e, we can shear the grid 45°—using, for example, $a'=(a_x,a_x+a_y)$. Applying the basic taxi-cab distance, we arrive at the following cut-off:

L1 distance.

The $L_2$ metric, on the other hand, translates to the usual euclidean distance, that is:

$\|a-b\|_2=\sqrt{(a_x-b_x)^2+(a_y-b_y)^2}$

which leads to the following map:

L2 Distance

if we compute, for each point, where the closest starting point is. Of course, we could have determined the cut using plain ol’ geometry. Tracing two circles large enough to intersect, each centered at a starting point:

Computing L2 cut, with compass and ruler (part I)

finding the intersections:

Computing L2 cut, with compass and ruler (part II)

and running a line through the two points:

Computing L2 cut, with compass and ruler (part III)

we would arrive to the same solution.

But in these two example, we assume, somehow, that we can fly from either starting point and land anywhere on the map. Well, turns out that I can’t fly, and none of my lunch-mates can either; so we have to walk following streets (or at least sidewalks, but let’s simplify the problem and assume it’s the same).

This means that to settle the argument for good, we should first extract the street network from the map and define a maximum search horizon:

The street network

Which will need some further post-processing to remove colors, artifacts and street names (just thresholding and dilation), yielding an imprecise, but workable street map:

Post-Processed Street Network (with Highways Removed)

Finally, we use a dynamic programming approach of finding, for each point on the map, the shortest path between these points and either starting point, following streets, no flying allowed. Superimposing the results on the original map:

Optimal, dynamic programming, street-aware L1 solution

from which we can plot the final neutral zone.

The final, Optimal Solution

The restaurants that are at the same distance from both offices lie somewhere on (or at least very close) to the dotted cut.

*
* *

Dynamic programming—in this case, the shortest path algorithm—is a powerful tool for computing exact solutions in some cases. However, the introductory-level textbook version of the Floyd-Warshall algorithm supposes that the graph is held in an adjacency matrix. In our case, each pixel is potentially a node in the graph, potentially connected to all other pixels.

This mean that for a picture $w$ pixels wide, $h$ pixel high, the adjacency matrix is a huge $(w+h)\times(w+h)$ matrix. In our case, $w\approx{}1700$ and $h\approx{}950$, making the adjacency matrix approach next to infeasible. Furthermore, once you realize that in this problem a pixel can only be connected to its immediate neighbors—4 or 8, depending on what kind of connectivity you allow—you see that the matrix is approach is wasteful as it is incredibly sparse.

A possible solution is to work with a sparse matrix object that allows efficient iterators so that the Floyd-Warshal algorithms only loop on non-zero entries.

Another solution is to grow the connected parts from the edges of the graph around the seeds. Each iteration, you only examine the pixels in the neighborhood of the graph that is already solved. The algorithm terminates when all reachable pixels have been reached. This approach reduces considerably the the storage needed—from $O( (w+h)^2)$ to $O(w\times{}h)$—while reducing the complexity as well. The complexity is therefore reduced, for an iteration, to a constant number of tests for each pixel around the perimeter; which is roughly $O(w+h)$, and there are at most $O(\max(w,h))$ iterations. The algorithm stops when all pixels have been reached.

2 Responses to Computing the Neutral Zone

1. rmn says:

Very elegant solution to a pretty common problem :)
Do you really have a working sourcecode for this? If you work on it a little more so it could fetch the original map as well, you could probably release it – I am sure many people will find it useful.

I liked the post, thanks :)

2. Steven Pigeon says:

Yes, I do have code for this, of course it’s just a “proof of concept” … that is, 100 LOCs barely holding together with duct tape :p For example, the start points *must* be on a street. If a start point falls in the middle of a black square, nothing happens.

Ideally, we should work with the vector versions of the streets, and solve the problem in street space, rather than, like I did, pixel space. The computational complexity would be reduced even more… by orders of magnitudes, probably.