On getting lucky in higher dimensions
July 31, 2011
Pin the Tail on the Donkey is an all time favourite for children’s parties. For those of you not familiar with the game, a player is blindfolded and spun around until thoroughly disoriented. Then, cheered on by her friends, she must attempt to stick the ‘tail’ as close as possible to the proper location on the picture of a donkey on the wall. It’s not difficult to calculate the odds that a certain blind pick will yield a result close enough to win the game. If we assume that only picks within the boundary of the donkey image are allowed, the odds of making a winning pick are the area of the winning region (drawn in red), divided by the area of the entire image
We can simplify this by choosing sensible dimensions for both the boundary and the winning region. Let’s say the boundary is a square with an edge length of one units, while the winning region is a circle with a radius of half a unit. This means the boundary has an area of 1 units2, while the winning region has an area of pi × 0.52 ≈ 0.785 units2. Thus the chances of making a winning pick are roughly 80% in this simplified case.
But what are our chances if we’re supposed to pick a single point in three-dimensional space? Instead of dividing areas, we must now divide volumes. And instead of a square with a circle, we’re now dealing with a cube and a sphere.
Now, the volume of a unit cube is always 1.0, no matter how high the dimensionality. This is of course why it is a sensible value. The volume of a sphere in three dimensions is defined as (4/3) × pi × r3 which for a radius of 0.5 roughly equals 0.524 units3. So the volume of a sphere is much smaller compared to the volume of the containing cube than is the area of a circle compared to the area of the containing square. In other words, there’s a lot more left-over space. What happens if we go even higher than 3 dimensions?
Here you see for the first 12 spatial dimensions the ratio of volume of the (hyper)sphere* as a percentage of the volume of the (hyper)cube. With every additional dimension, the ratio is nearly halved. The tilted numbers next to the graph represent the volume of the hyper-sphere rounded to three decimal places and it doesn’t take many dimensions for the hyper-sphere to become utterly insignificant compared to the hyper-cube. This is probably somewhat counter-intuitive as the cube is still a tight fit around the sphere, but then things tend to get counter-intuitive with gay alacrity when more than 3 dimensions are involved.
The purpose of this post is not to tell you that you should pick a lower-dimensional reality from your opponents when playing Pin the Tail on the Donkey. Rather, Pin the Tail on the Donkey is an excellent analogy for computational random search algorithms. You don’t have to blindfold and spin computers, being ignorant and disoriented is their default state. It is actually removing the blindfold that typically requires an awful lot of difficult programming.
When an Evolutionary Solver (such as Galapagos) starts, it has absolutely nothing to go on. Eventually new generations are created from previous generations using a combination of breeding and selection and mutation algorithms, but the first generation has to be magicked into existence out of thin air. A common strategy is to completely randomize the individuals in the first generation, basically pinning tails onto the hyper-donkey. Since the dimensionality of the search-space equals the number of variables, there is a vanishing —indeed, rapidly vanishing— chance that any individual in Generation 1 is going to get lucky.
Thankfully only the first generation suffers from this problem, as subsequent generations use a far more directed (as opposed to random) search, but it is important that you allow for sufficient individuals in Generation 1, so that the chances of getting immediately stuck in a local optimum are small. This is why Galapagos has an Initial Boost factor setting, which multiplies the number of individuals in Generation 1.
* The formulas for hyper-sphere volumes are:
R2 → π · r2
R3 → (4/3) · π · r3
R4 → (1/2) · π2 · r4
R5 → (8/15) · π2 · r5
R6 → (1/6) · π3 · r6
R7 → (16/105) · π3 · r7
R8 → (32/945) · π4 · r8
R9 → (1/24) · π4 · r9
R10 → (1/120) · π5 · r10
R11 → (64/10395) · π5 · r11
R12 → (1/720) · π6 · r12