Have you ever add to decide, you and your colleagues, where to go for lunch? Each time, it ends up being a committee, of course. It gets even worse when not only you have many colleagues, but also two offices, or two groups, at different locations. Since we work in a rather large city, we want to walk to the chosen restaurant, rather than drive, but in a way that is fair to either group.
So to settle the argument about where are the restaurants midway of both locations, we need a map, and some math.
The first step is to get a map of the disputed locations. Google maps is an excellent source of maps, as it is both easily accessible and mostly accurate. Grabbing a screen-shot from the browser:
and you mark both starting locations (black, round, unnamed dots on the map).
The second thing is to decide how you measure distance. Will it be a measure or a measure? The measure is also know as the taxi-cab distance, measures distance as if moving along a discrete, orthogonal grid. Between two points and on a 2D-grid, the distance is given by:
That is, nothing fancier than the sum of absolute differences in each dimensions. Since the street grid is running s-w to n-e, we can shear the grid 45°—using, for example, . Applying the basic taxi-cab distance, we arrive at the following cut-off:
The metric, on the other hand, translates to the usual euclidean distance, that is:
which leads to the following map:
if we compute, for each point, where the closest starting point is. Of course, we could have determined the cut using plain ol’ geometry. Tracing two circles large enough to intersect, each centered at a starting point:
finding the intersections:
and running a line through the two points:
we would arrive to the same solution.
But in these two example, we assume, somehow, that we can fly from either starting point and land anywhere on the map. Well, turns out that I can’t fly, and none of my lunch-mates can either; so we have to walk following streets (or at least sidewalks, but let’s simplify the problem and assume it’s the same).
This means that to settle the argument for good, we should first extract the street network from the map and define a maximum search horizon:
Which will need some further post-processing to remove colors, artifacts and street names (just thresholding and dilation), yielding an imprecise, but workable street map:
Finally, we use a dynamic programming approach of finding, for each point on the map, the shortest path between these points and either starting point, following streets, no flying allowed. Superimposing the results on the original map:
from which we can plot the final neutral zone.
The restaurants that are at the same distance from both offices lie somewhere on (or at least very close) to the dotted cut.
Dynamic programming—in this case, the shortest path algorithm—is a powerful tool for computing exact solutions in some cases. However, the introductory-level textbook version of the Floyd-Warshall algorithm supposes that the graph is held in an adjacency matrix. In our case, each pixel is potentially a node in the graph, potentially connected to all other pixels.
This mean that for a picture pixels wide, pixel high, the adjacency matrix is a huge matrix. In our case, and , making the adjacency matrix approach next to infeasible. Furthermore, once you realize that in this problem a pixel can only be connected to its immediate neighbors—4 or 8, depending on what kind of connectivity you allow—you see that the matrix is approach is wasteful as it is incredibly sparse.
A possible solution is to work with a sparse matrix object that allows efficient iterators so that the Floyd-Warshal algorithms only loop on non-zero entries.
Another solution is to grow the connected parts from the edges of the graph around the seeds. Each iteration, you only examine the pixels in the neighborhood of the graph that is already solved. The algorithm terminates when all reachable pixels have been reached. This approach reduces considerably the the storage needed—from to —while reducing the complexity as well. The complexity is therefore reduced, for an iteration, to a constant number of tests for each pixel around the perimeter; which is roughly , and there are at most iterations. The algorithm stops when all pixels have been reached.