March 11, 2013
Fine, I lied. Post  wasn’t the last one after all. I’d like to propose one more optimization to the Quadtree/Octree algorithm as explained in the previous post.
As you may recall, spatial trees are clever because they [A] allow us to quickly —almost trivially— reject large portions of a dataset. The recursive nature of the spatial tree means that if we can reject a specific node, we can also reject all of its child-nodes. Furthermore, because the tree has variable density it is very well suited to handle unbalanced point collections.
We can reject a node in the spatial tree when the shortest distance from the search locus to the boundary of that node is already larger than the best answer we’ve found so far. From this it follows that the smaller the node region, the more quickly we can reject a node. Typically the regions of all child-nodes equal the region of the parent-node. However it’s only important to have a tree without gaps between nodes if we’re still in the process of inserting points into the tree. When we’re done adding points we can perform two fairly easy optimizations that will improve the search speed (though not enough to change the runtime-complexity).
Optimization number one involves shrinking every node region to contain exactly the points inside of it. Or —if the node contains child-nodes rather than points— shrinking the node region to contain exactly the regions of the child-nodes. By shrinking the node regions to the bare minimum we will increase the shortest distance from the search locus to the node region, which will —sometimes— result in earlier rejection.
Optimization number two involves simplifying the tree structure by removing nodes that contain exactly one child-node. We can ‘detach’ the single-child and plug it into the same slot as its parent used to occupy. By removing these pointless nodes we reduce the time spend traversing the tree.
The image above shows the same Quadtree we build in the previous post, but now the nodes are colour-coded to reflect what sort of optimization we can apply to them. Orange nodes are empty and can be removed entirely. Green nodes contain points whose collective bounding-box is smaller than the node region. By removing empty nodes and shrinking the remainder, we end up with a vastly reduced total structure.
Once you apply these optimizations to a tree structure it is no longer possible to insert more points, as they might fall in the gaps that now exist between adjacent nodes. Unless you’re willing to make your point insertion smart enough to grow node regions on demand of course.
Seriously, I’ll stop harping on about spatial trees now.
March 9, 2013
Welcome to the —for now— last post in this series on algorithmic optimization. We talked a bit about generic optimization, then we introduced a specific problem and explored different ways of speeding up finding the solution to that specific problem. We talked about brute-force searching and why it’s not an option when we’re dealing with large point collections. We discussed one dimensional sorting and spatial bucketing and the merits and drawbacks of each algorithm. It is now time to discuss the ultimate of closest-point-searching algorithms, the weapon of choice for the informed and discriminating programmer; spatial trees.
But let us start by briefly rehashing the logic of the square-grid-search algorithm and why it’s not a particularly good one. By creating a grid of adjacent cells and assigning each point p in a collection C to whatever cell contains it, we gain the ability to explore C in a localized fashion. Although we may not know exactly which two points are direct neighbours, we do know which two cells are direct neighbours and therefore all the points in these cells can be treated as being in the same neighbourhood. This works splendidly while single cells do not contain too many points and while there aren’t too many completely empty cells. For balanced point collections we just need to pick a clever grid size, but for unbalanced collections we are rightly and truly screwed:
If we pick a large grid cell size then too many points end up in the same bucket and we’ve converted a smart search algorithm into a brute-force algorithm. If we pick a small grid cell size then there will be loads of empty cells and iterating over them will retard the search process.
If at this point you’re thinking “If brute-force search is a prohibiting factor in the case of grid cells with many points, why not put a smaller grid inside those cells?” you can feel real good about yourself. In fact, once we establish that it’s allowed to put a smaller grid inside the cell of a bigger grid, we can simplify the whole grid notion and only support grids that are composed of 2×2 cells. To create a nested grid structure like this, we only need a very short algorithm:
- Draw a rectangle around all points in C.
- If the number of points in this rectangle exceeds N, then subdivide the rectangle into four smaller rectangles.
- Assign each of the points in the parent rectangle to the appropriate child rectangle, and then repeat steps 2 & 3 for each child rectangle.
The above algorithm is recursive and the grid structure it generates is no longer one where all cells have the same size. If points are more densely packed in one region of space, the grid-structure will subdivide more often and it will result in smaller cells. If points are completely absent from another region of space the grid-structure will not subdivide at all and we won’t be plagued by a plethora of empty grid cells. If we apply this logic to an unbalanced point collection, we should get something like this:
Level 0 represents the top-most level of the grid. The number of points inside the outer boundary is too high and the region is split into four quadrants. The lower right quadrant still has too many points so it too is split into four quadrants. Then the process repeats one more time giving us a third level of subdivision. Now we have a grid containing ten cells of varying size containing the entire point collection. In fact two of the cells contain no points and can be removed from the grid structure in order to reduce memory overhead. The entire structure also contains three cells which host nested grids.
A spatial data structure like this is typically referred to as a tree in programming because the recursive and fractal nature of the algorithm resembles the branching patterns of biological trees. There are many different flavours of spatial trees, some more suited for specific problems, some easier to implement, some optimized for point insertion or deletion. What I’ve drawn above is usually called a Quadtree (or Q-tree) because it always subdivides into quadrants. If we extend this notion to 3D space and we have to subdivide a single box volume into eight sub-volumes and we call the resulting structure an Octree instead.
So how does searching in a Quadtree work? Remember that in order to perform a fast closest point search, we need to reject as many points as possible, limiting the search to only those points which are serious candidates. In some ways searching a Quadtree is easier than searching a fixed grid, one of the reasons being that we don’t actually have any empty cells. The search algorithm might work roughly like this:
- Build a tree T from the point collection C.
- Find the cell L in T which is closest to the search point h.
- Find the point p in L which is closest to h using —once again— brute-force search.
- Start recursing to sibling and parent cells to see if they contain points closer still.
- Every cell that has a minimum distance to h which is already larger than the smallest distance we found so far can instantly be skipped, as can all child-cells within it.
Step 5 is where the real magic happens. Spatial trees make it very easy to weed out enormous portions of C that are not near h. Adding points to an existing tree structure is also very easy, even if they are outside the outer bounds of the level 0 cell. All we need to do in those cases is insert another level underneath level 0 (which then by definition becomes level 1) in order to grow the reach of the tree. Every time we add such a bottom level we double the reach, so even inserting points which are really far away will not take intolerably long (unlike fixed grids which grow in a linear rather than exponential fashion).
As I mentioned in post #1, optimization is often a difficult process and code which is optimized often gains so much complexity as to become unreadable —at least without significant effort— by third parties. Only start optimizing code once you’ve determined beyond a shadow of a doubt where the bottleneck in your program is.
March 9, 2013
One of the main drawbacks of the adapted-sorted-brute-force search algorithm we developed in the previous post is that it only sorts the data along a single coordinate axis. For two dimensional data that means we’re ignoring half the information. For three dimensional data we’re ignoring twice as much information as we’re using to speed up the search. The more dimensions, the less efficient that searcher becomes.
The key to optimizing a Closest Point Search is to reject as many points as possible because measuring the distance between points takes time. It might not take a lot of time to compute the distance between two points, but if you want to find the nearest point in a collection of 10 million points for 1 million other points, you end up computing 107 * 106 = 1013 distances if you take the brute-force approach. Even if a single distance test only takes 5 nanoseconds it will still take 50,000 seconds, which is almost 14 hours. The adapted-sorted-brute-force (which wasn’t all that brute-force by the time we were done with it) managed to reject points because their distance measured along the sorting direction exceeded the best result we’ve found so far.
What we’d like to accomplish today is roughly the same, except we want to be able to reject points in all directions, not just one. Instead of sorting all the points, what we’ll do instead is put them in spatial buckets, and these buckets in turn will be defined in such a fashion as to allow very easy navigation to neighbouring buckets. In other words: a grid.
In the image above you can see the entire point collection represented by the black dots. Point h is represented by the red cross and we’re looking for the black dot closest to h. This search algorithm also requires a pre-process step, but instead of sorting all the points we’ll instead assign them to whatever grid square contains them. So instead of a single large collection of points (C), we end up with a grid structure where each grid cell contains a subset of C. This is a very cheap process as it only takes a single subtraction, multiplication and rounding operation per dimension per point*. It’s also very easy to multithread this pre-process as it belongs to a class of problems known as embarrassingly parallel. The algorithm —which we shall henceforth refer to as the square-grid-search algorithm— would work as follows:
- Create a grid of square cells that encompasses all of C.
- Assign all points in C to their respective grid cells.
- Find which cell r contains h (or which is closest to h if h is not within the grid bounds).
- Find the nearest point p in r to h using brute-force search.
- If r is empty, extend to search to all eight neighbouring cells and so on until at least one point is encountered.
- Find the collection of grid cells R which intersect with the search boundary (represented by the green circle).
- Iterate over all points in R using brute-force to see if any of them are closer than the early result.
The algorithm is fairly straightforward and easy to implement, but —like adapted-sorted-brute-force— is has serious drawbacks. It will perform very well on balanced point collections, but the performance heavily depends on the size of the grid cells. Too small cells and many of them will be empty, causing us to spend a lot of time iterating over pointless** areas. Too big and they’ll contain a lot of points which will slow down the algorithm as the intra-cell search is brute-force.
If the collection is unbalanced then we’ll end up with many empty cells and some cells which contain many points. As long as the intra-cell search remains brute-force, square-grid-search will not be suitable for such cases.
There is a new drawback as well. It’s quite simple and very fast to insert points into a sorted collection without breaking the sort order. It is also very easy to insert points into a spatial bucketing grid, provided the points are within the grid boundaries. If you wish to add points outside of the existing grid boundary, the grid will have to be grown, which can be a computationally expensive operation***.
Up next we’ll examine ways to make spatial bucketing more effective for unbalanced —indeed, wildly unbalanced— point collections.
* Assuming the Grid only has rectangular cells. Square cells make the algorithm much easier to implement though, not to mention hexagonal or free-form cells.
** Bow down before the Master of Puns!
*** It is possible to use a data structure known as a sparse array to store the grid cells rather than a standard multi-dimensional array. This effectively solves the insertion problem, but it also makes navigation between neighbouring cells slightly slower.