I’ve written a paper for the ICGG2014 conference to go with my talk. It mostly talks about the theoretical side of things, whereas the talk is going to focus on the pragmatic. In case they won’t accept it I’ll post it here soon, but if they do then it will be available online sometime after the conference.

Over the past few years I’ve grown increasingly tired of slick computer generated graphics. They all too often fail to draw attention to salient details and convey meaning. It’s rare these days that I get to make large illustrations, most of my graphic work is for icons, so I do try and savour it when I get the chance. I’ve put the 10 images I drew for this publication below the fold (with LaTeX overlays and captions).

Read the rest of this entry »

What’s the point? [5]

March 9, 2013

Welcome to the —for now— last post in this series on algorithmic optimization. We talked a bit about generic optimization, then we introduced a specific problem and explored different ways of speeding up finding the solution to that specific problem. We talked about brute-force searching and why it’s not an option when we’re dealing with large point collections. We discussed one dimensional sorting and spatial bucketing and the merits and drawbacks of each algorithm. It is now time to discuss the ultimate of closest-point-searching algorithms, the weapon of choice for the informed and discriminating programmer; spatial trees.

But let us start by briefly rehashing the logic of the square-grid-search algorithm and why it’s not a particularly good one. By creating a grid of adjacent cells and assigning each point p in a collection C to whatever cell contains it, we gain the ability to explore C in a localized fashion. Although we may not know exactly which two points are direct neighbours, we do know which two cells are direct neighbours and therefore all the points in these cells can be treated as being in the same neighbourhood. This works splendidly while single cells do not contain too many points and while there aren’t too many completely empty cells. For balanced point collections we just need to pick a clever grid size, but for unbalanced collections we are rightly and truly screwed:

Large cells vs. small cells in unbalanced point collections.

Large cells vs. small cells in unbalanced point collections.

If we pick a large grid cell size then too many points end up in the same bucket and we’ve converted a smart search algorithm into a brute-force algorithm. If we pick a small grid cell size then there will be loads of empty cells and iterating over them will retard the search process.

If at this point you’re thinking “If brute-force search is a prohibiting factor in the case of grid cells with many points, why not put a smaller grid inside those cells?” you can feel real good about yourself. In fact, once we establish that it’s allowed to put a smaller grid inside the cell of a bigger grid, we can simplify the whole grid notion and only support grids that are composed of 2×2 cells. To create a nested grid structure like this, we only need a very short algorithm:

  1. Draw a rectangle around all points in C.
  2. If the number of points in this rectangle exceeds N, then subdivide the rectangle into four smaller rectangles.
  3. Assign each of the points in the parent rectangle to the appropriate child rectangle, and then repeat steps 2 & 3 for each child rectangle.

The above algorithm is recursive and the grid structure it generates is no longer one where all cells have the same size. If points are more densely packed in one region of space, the grid-structure will subdivide more often and it will result in smaller cells. If points are completely absent from another region of space the grid-structure will not subdivide at all and we won’t be plagued by a plethora of empty grid cells. If we apply this logic to an unbalanced point collection, we should get something like this:



Level 0 represents the top-most level of the grid. The number of points inside the outer boundary is too high and the region is split into four quadrants. The lower right quadrant still has too many points so it too is split into four quadrants. Then the process repeats one more time giving us a third level of subdivision. Now we have a grid containing ten cells of varying size containing the entire point collection. In fact two of the cells contain no points and can be removed from the grid structure in order to reduce memory overhead. The entire structure also contains three cells which host nested grids.

A spatial data structure like this is typically referred to as a tree in programming because the recursive and fractal nature of the algorithm resembles the branching patterns of biological trees. There are many different flavours of spatial trees, some more suited for specific problems, some easier to implement, some optimized for point insertion or deletion. What I’ve drawn above is usually called a Quadtree (or Q-tree) because it always subdivides into quadrants. If we extend this notion to 3D space and we have to subdivide a single box volume into eight sub-volumes and we call the resulting structure an Octree instead.

So how does searching in a Quadtree work? Remember that in order to perform a fast closest point search, we need to reject as many points as possible, limiting the search to only those points which are serious candidates. In some ways searching a Quadtree is easier than searching a fixed grid, one of the reasons being that we don’t actually have any empty cells. The search algorithm might work roughly like this:

  1. Build a tree T from the point collection C.
  2. Find the cell L in T which is closest to the search point h.
  3. Find the point p in L which is closest to h using —once again— brute-force search.
  4. Start recursing to sibling and parent cells to see if they contain points closer still.
  5. Every cell that has a minimum distance to h which is already larger than the smallest distance we found so far can instantly be skipped, as can all child-cells within it.

Step 5 is where the real magic happens. Spatial trees make it very easy to weed out enormous portions of C that are not near h. Adding points to an existing tree structure is also very easy, even if they are outside the outer bounds of the level 0 cell. All we need to do in those cases is insert another level underneath level 0 (which then by definition becomes level 1) in order to grow the reach of the tree. Every time we add such a bottom level we double the reach, so even inserting points which are really far away will not take intolerably long (unlike fixed grids which grow in a linear rather than exponential fashion).

As I mentioned in post #1, optimization is often a difficult process and code which is optimized often gains so much complexity as to become unreadable —at least without significant effort— by third parties. Only start optimizing code once you’ve determined beyond a shadow of a doubt where the bottleneck in your program is.

What’s the point? [4]

March 9, 2013

One of the main drawbacks of the adapted-sorted-brute-force search algorithm we developed in the previous post is that it only sorts the data along a single coordinate axis. For two dimensional data that means we’re ignoring half the information. For three dimensional data we’re ignoring twice as much information as we’re using to speed up the search. The more dimensions, the less efficient that searcher becomes.

The key to optimizing a Closest Point Search is to reject as many points as possible because measuring the distance between points takes time. It might not take a lot of time to compute the distance between two points, but if you want to find the nearest point in a collection of 10 million points for 1 million other points, you end up computing 107 * 106 = 1013 distances if you take the brute-force approach. Even if a single distance test only takes 5 nanoseconds it will still take 50,000 seconds, which is almost 14 hours. The adapted-sorted-brute-force (which wasn’t all that brute-force by the time we were done with it) managed to reject points because their distance measured along the sorting direction exceeded the best result we’ve found so far.

What we’d like to accomplish today is roughly the same, except we want to be able to reject points in all directions, not just one. Instead of sorting all the points, what we’ll do instead is put them in spatial buckets, and these buckets in turn will be defined in such a fashion as to allow very easy navigation to neighbouring buckets. In other words: a grid.


In the image above you can see the entire point collection represented by the black dots. Point h is represented by the red cross and we’re looking for the black dot closest to h. This search algorithm also requires a pre-process step, but instead of sorting all  the points we’ll instead assign them to whatever grid square contains them. So instead of a single large collection of points (C), we end up with a grid structure where each grid cell contains a subset of C. This is a very cheap process as it only takes a single subtraction, multiplication and rounding operation per dimension per point*. It’s also very easy to multithread this pre-process as it belongs to a class of problems known as embarrassingly parallel. The algorithm —which we shall henceforth refer to as the square-grid-search algorithm— would work as follows:

  1. Create a grid of square cells that encompasses all of C.
  2. Assign all points in C to their respective grid cells.
  3. Find which cell r contains h (or which is closest to h if h is not within the grid bounds).
  4. Find the nearest point p in r to h using brute-force search.
  5. If r is empty, extend to search to all eight neighbouring cells and so on until at least one point is encountered.
  6. Find the collection of grid cells R which intersect with the search boundary (represented by the green circle).
  7. Iterate over all points in R using brute-force to see if any of them are closer than the early result.

The algorithm is fairly straightforward and easy to implement, but —like adapted-sorted-brute-force— is has serious drawbacks. It will perform very well on balanced point collections, but the performance heavily depends on the size of the grid cells. Too small cells and many of them will be empty, causing us to spend a lot of time iterating over pointless** areas. Too big and they’ll contain a lot of points which will slow down the algorithm as the intra-cell search is brute-force.

If the collection is unbalanced then we’ll end up with many empty cells and some cells which contain many points. As long as the intra-cell search remains brute-force, square-grid-search will not be suitable for such cases.

There is a new drawback as well. It’s quite simple and very fast to insert points into a sorted collection without breaking the sort order. It is also very easy to insert points into a spatial bucketing grid, provided the points are within the grid boundaries. If you wish to add points outside of the existing grid boundary, the grid will have to be grown, which can be a computationally expensive operation***.

Up next we’ll examine ways to make spatial bucketing more effective for unbalanced —indeed, wildly unbalanced— point collections.

* Assuming the Grid only has rectangular cells. Square cells make the algorithm much easier to implement though, not to mention hexagonal or free-form cells.

** Bow down before the Master of Puns!

*** It is possible to use a data structure known as a sparse array to store the grid cells rather than a standard multi-dimensional array. This effectively solves the insertion problem, but it also makes navigation between neighbouring cells slightly slower.

What’s the point? [3]

March 9, 2013

Remember the sorted brute-force search algorithm we came up with last time? It attempted to outperform regular brute-force searching by sorting the points left to right and then skipping as many points as possible by deciding that the best possible answer has already been found. We also mentioned that sometimes it makes more sense to iterate from right-to-left, or top-to-bottom or bottom-to-top depending on the relative location of the search locus (h) with respect to the point collection (C) as a whole.

But in many cases we have to work our way from one extreme of C to the point nearest to h before we can seriously start to omit points. If only we could start somewhere in the vicinity of h then the chances of finding the best possible answer quick will be much higher, which in turn means we can skip more points, which in turn means we’ll be done quicker. So an improved sorted brute-force search algorithm might look like:

  1. Sort the points in C by ascending x-coordinate (i.e. from left to right).
  2. Starting from a point p in C somewhere in the vicinity of h, iterate towards the right and compute the distance to h.
  3. If this distance is less than the previous smallest distance, remember this new distance as the smallest distance and remember p as the best result so far.
  4. If the x-coordinate of p is higher than the x-coordinate of h + the current smallest distance then stop iterating.
  5. Starting from the point p in C that was in the vicinity of h, iterate towards the left and compute the distance to h.
  6. If this distance is less than the previous smallest distance, remember this new distance as the smallest distance and remember p as the best result so far.
  7. If the x-coordinate of p is lower than the x-coordinate of h – the current smallest distance then stop iterating.
  8. When we’re done iterating, we’ll have remembered the one nearest to h.

The only difference between this adapted search algorithm and the original is that instead of iterating from one end of C towards the other, we start somewhere in the middle, at a point that is already near h. But how do we quickly find a point near h? Wasn’t that what we were trying to accomplish in the first place?

Well, not quite. We’re looking for the point closest to h. But for this adapted algorithm to work, all we need is a point somewhere in the vicinity of h. As long as it’s not way off, we should be able to get some benefit out of our smarter iteration logic. So the trick now is to find this magical point in the vicinity of h (let’s call it pbase) and of course we need to do so quickly because we’re trying to safe time, not spend it.

There is a more rigid definition of pbase than “in the vicinity of” and it may give a clue as to how we’re supposed to find it quickly. pbase can be defined as that point p in C whose x-coordinate is most similar to the x-coordinate of h. In other words, for this initial guess, we’re happy looking only at the distribution of C in the x-direction, ignoring the fact that C is a two-dimensional point collection and thus also has a distribution in the y-direction. There is a fantastically clever and fast and easy way to find a value in a sorted collection that is close to some other value. It is called Binary Searching.

A binary search progresses by taking a series of decreasing steps towards the right answer. It only requires that the data it is stepping over is sorted. A major benefit of binary searching is that the runtime complexity is O(log N), which is a very expensive way of saying that as the collection gets larger, it doesn’t take up correspondingly more time. Imagine for example that we use binary search to find an item in a collection containing 100 items. It will take —on average— N milliseconds to complete. If we now apply the same algorithm to a collection ten times the size, i.e. 1000 values, it might only take 2×N milliseconds rather than 10×N. If we make the collection bigger still, say 10000 values, it will take only 4×N milliseconds rather than 100×N.

Put even simpler, binary search —and other algorithms with a logarithmic runtime complexity— take proportionally less time to complete big problems as opposed to small problems.

Let’s have a look at how a binary searcher might find pbase in our point collection. The image below shows the points in C as grey circles, sorted from left to right. Note that the y-coordinate of each point  —while drawn— is irrelevant. Below the points are drawn the four steps that a binary searcher performs in order to arrive at the conclusion that point #3 is closest to h (×) if you only take the x-distribution into account. By a happy coincidence, point #3 is also the closest point in general, but that’s not necessarily the case.


A binary searcher always operates on the same basic principle, which is then recursively applied over and over until a solution is found. We know that h is somewhere in between the extremes of the entire collection. We can ascertain this by making sure that h is to the right of the first point (0) and to the left of the last point (19). So we take the point halfway in between 0 and 19 and test whether it is to the left or to the right of h. That point happens to be #10 and it happens to be to the right of h. So now we know that the real answer must be somewhere between 0~10.

Once again we test the point halfway between 0 and 10, which is point #5. Same deal as before, 5 is to the right of h and we’ve now shrunk our search range to 0~5. If we look halfway between 0 and 5, we find point #2 which happens to be to the left of h, so instead of adjusting the upper limit of the search range we now adjust the lower and end up with a range of 2~5. Repeat until you can move neither left nor right without making things worse.

As you can see, the search range is effectively cut in half with every step. So when you double the number of items, it only takes one extra step to complete the search. Quadruple the number of items and it only takes two additional steps. This is why binary search is a logarithmic algorithm.

We’ve now developed a closest point search algorithm which actually runs pretty fast, especially on balanced point collections. We’ve gone from brute-force, to sorted-brute-force, to adapted-sorted-brute-force which is hardly brute-force at all any more.

However there are still deep problems with the core algorithm and if we want to address them we’ll have to drastically rethink our approach. The main issue is that the sorting only happens along a single axis. With well balanced, two-dimensional points that isn’t too big a deal, but if you start dealing with three or more dimensions it becomes increasingly less efficient.

Up next, spatial buckets and how they can be used to generate a notion of locality.