Simulated Annealing, a brief introduction

October 14, 2011

Ever since version 0.8.0051 of Grasshopper, there is a second solver available from within Galapagos which implements the Simulated Annealing algorithm. Like the existing Evolutionary solver, Simulated Annealing is also a meta-heuristic technique, but works in a fundamentally different fashion. Having access to both solvers makes it easier to circumvent some of the shortcomings of each. Ironically, Simulated Annealing is a much simpler process than Simulated Evolution but may be harder to understand since the real-world analogy is more abstract and based on a less well known process.

In metallurgy, annealing is the process of controlled heating and cooling of metal to achieve certain material properties. At first, the metal is heated up to melting point so it can be cast or formed. At an atomic level, heat is nothing more than particle velocity. The particles (atoms & molecules alike) in a hot substrate move faster than the same particles in a cold substrate. At some point the velocity of two particles will be so high that they cannot succeed in forming a persistent bond between them. When this happens the substrate loses internal structure and turns liquid.

Atoms in liquid metal

Similarly, when a substrate is liquid but starts to cool down, there will come a point where the atoms can form lasting bonds and the internal structure of the substrate is resurrected, turning the liquid into a solid. These two processes are of course colloquially known as “melting” and “freezing”. However it’s not quite as simple as all this. In 1866 James Clerk Maxwell formulated the equations that described the distribution of particle velocities in a gas of constant macro temperature. For example, when you have a litre of helium gas at room-temperature it contains roughly 3 × 1022 atoms and the most probable speed of these atoms is a little over a 1000 meters per second. However there will be a lot of very slow atoms as well as a few much faster ones. This same phenomenon holds for liquids too, although the velocity distributions are not as well defined nor do they cover quite so large a range.

The upshot of all this is that when a substrate is allowed to cool, some atoms will cross the liquid|solid threshold before others. In other words, a substrate doesn’t freeze everywhere at once, small clumps of relatively slow atoms group together and form the freezing ‘seeds’. Especially when we’re talking about metals this is important because when metal atoms freeze, they like to form a regular lattice, or crystal.

Crystal seeds in semi-liquid metal

These small islands of glued together atoms grow over time as more and more slow atoms attach themselves to the seeds, allowing the micro-crystals to expand. When the cooling takes a long time, atoms are allowed to find the optimal (minimal energy) distribution. If on the other hand the cooling is very quick —for example by dumping the hot metal into water— there is no time to form large crystals and the substrate becomes amorphous.

Regular atomic lattice

So how does this help us find solutions in Galapagos? To be sure, the analogy is fairly tenuous but the simple fact remains that treating all the parameters in a problem as an atomic thermodynamic system allows us to find relatively good answers relatively quickly. Basically, with Simulated Annealing, we seek to crystallize the parameters into the lowest energy state.

A typical annealing run consists of a number of successive jumps in the problem phase-space, where the amplitude of each jump and its legitimacy are affected by the temperature of the system. When the temperature of the system is high, large jumps are allowed and there is a significant likelihood that a worse answer is adopted despite being a setback. The graph below shows a typical annealing track.

A schematic annealing track

Along the horizontal axis all possible states of the problem are collected. If the problem phase-space is one-dimensional (i.e. only a single Grasshopper slider), then this is an exact representation, but since this is a schematic representation the graph x-axis can represent any number of parameter dimensions. The vertical axis represents the fitness of each distinct state, so the thick black curve represents the entire fitness landscape. This particular landscape has 4 local optima with varying degrees of quality.

The first annealing jump must start at a random location because nothing is known about the landscape so no informed decision can be made. In this particular example, the annealing track starts along the left edge of the landscape at location (1). At this point in time the temperature is very high, so large jumps across phase-space are allowed. A jump from (1) to (2) would indeed qualify as large as it spans almost the entire phase-space. Since (2) is higher than (1), we automatically accept the new solution. Now we need to jump again, but there is less energy available as the entire system is cooling down. So the jump from (2) to (3) is most likely going to be shorter than the jump from (1) to (2). The fitness at (3) happens to be lower than (2), so the second jump actually represents a worse answer. However, since the temperature is still relatively high, we’re sometimes (where “sometimes” is in accordance with thermodynamic stochastics) forced to accept a worse case. This is not as stupid as it sounds though, as this characteristic makes the solver less prone to getting ‘stuck’ to a local optimum.

As time goes on, the temperature of the system drops and smaller and smaller jumps are possible. Also, at low temperatures the chance that a worse case is adopted over a better one becomes insignificant. Eventually the temperature has dropped far enough for the entire system to be frozen, at which point the best answer from the run is cached and a second run is started, once again at high temperature.

Because of the cooling schedule, it means that every annealing run only has a limited number of steps and therefore terminates in a limited time-span. Simulated Evolution does not have this characteristic; it will happily churn away for all eternity without ever looking back. This fundamental difference between the two algorithms makes Simulated Annealing a good candidate for finding many promising optima, while Simulated Evolution is much better at refining a single solution.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: