## What’s the point? [6]

### March 11, 2013

Fine, I lied. Post [5] wasn’t the last one after all. I’d like to propose one more optimization to the Quadtree/Octree algorithm as explained in the previous post.

As you may recall, spatial trees are clever because they [A] allow us to quickly —almost trivially— reject large portions of a dataset. The recursive nature of the spatial tree means that if we can reject a specific node, we can also reject all of its child-nodes. Furthermore, because the tree has variable density it is very well suited to handle unbalanced point collections.

We can reject a node in the spatial tree when the shortest distance from the search locus to the boundary of that node is already larger than the best answer we’ve found so far. From this it follows that the smaller the node region, the more quickly we can reject a node. Typically the regions of all child-nodes equal the region of the parent-node. However it’s only important to have a tree without gaps between nodes if we’re still in the process of inserting points into the tree. When we’re done adding points we can perform two fairly easy optimizations that will improve the search speed (though not enough to change the runtime-complexity).

Optimization number one involves shrinking every node region to contain exactly the points inside of it. Or —if the node contains child-nodes rather than points— shrinking the node region to contain exactly the regions of the child-nodes. By shrinking the node regions to the bare minimum we will increase the shortest distance from the search locus to the node region, which will —sometimes— result in earlier rejection.

Optimization number two involves simplifying the tree structure by removing nodes that contain exactly one child-node. We can ‘detach’ the single-child and plug it into the same slot as its parent used to occupy. By removing these pointless nodes we reduce the time spend traversing the tree.

The image above shows the same Quadtree we build in the previous post, but now the nodes are colour-coded to reflect what sort of optimization we can apply to them. Orange nodes are empty and can be removed entirely. Green nodes contain points whose collective bounding-box is smaller than the node region. By removing empty nodes and shrinking the remainder, we end up with a vastly reduced total structure.

Once you apply these optimizations to a tree structure it is no longer possible to insert more points, as they might fall in the gaps that now exist between adjacent nodes. Unless you’re willing to make your point insertion smart enough to grow node regions on demand of course.

Seriously, I’ll stop harping on about spatial trees now.

March 11, 2013 at 21:01

Ha, this post-computation shrinking is a good idea.

Also, one point that I found making the MATLAB octree from the previous post’s comment is that at the step of partitioning a node, it may be useful to consider different partitions than simply “equally divide the space into 4 (or 8) subpartitions”.

As long as it’s not too computationally expensive, I found that dividing partitions at the *average* of all the points it contains would often mean I could reach a goal (say, having leaf nodes with fewer than 20 points in them) with fewer iterations (ie, with a shallower tree structure).

This will generally take a few more cycles to create the tree, but like the shrinking idea you’ve proposed, it results in more efficient partitioning of the space so that it can also save cycles when actually *using* the tree for subsequent tasks.

March 11, 2013 at 22:04

Indeed, in the implementation I’m writing now for Grasshopper there is an option to divide down the middle or at the average of all points, with the average being the default. I’m also considering a squared-average split, or median split, but once I’ve got profiling test cases up and running I can start to play with the details.

When I did testing in the past with octrees I found that 100~150 nodes per leaf was optimal. I was surprised to find that high a number. I’ll need to perform all new tests once this new algorithm is up and running though.