March 7, 2011
Often the most difficult thing whilst setting up an Evolutionary Solver run is the definition of the Fitness Function. The sort of problem that is most suited for Evolutionary Solvers typically has a number (sometimes a very large number) of different variables that need to be solved. Sometimes these variables cooperate, in that improving one will also improve the other, sometimes they oppose and sometimes they are completely unconnected.
Imagine for example we are trying to put a collection of windows into a façade and we need to pay attention to the following properties:
- Daylight (the more daylight the better)
- Sunlight (the less direct sunlight the better)
- Solar Energy (the less heating due to solar radiation the better)
- Views (certain viewing directions are better than others)
- Cost (the cheaper the better)
Let’s call these five properties D, S, E, V and C respectively and let’s assume that we can encode V (Views) as a single scalar value instead of a complex set of view direction vectors. Some of these properties are (at least partially) linked. A decrease in window area will result both in less direct sunlight, less solar energy and probably less cost. Whereas an increase in window area will result in more daylight and probably better views. Basically we have five forces pulling the solution in different directions:
The most straightforward Fitness Function we could create for a system like this would look something like:
F = D + V – S – E – C
where the sign in front of each variable coheres with whether we want to maximize or minimize that particular variable. This certainly looks simple enough, but do not for a moment think that because the function looks simple, the solver progression is therefore also simple. What you don’t see is the relationship between the variables inside the fitness function and whatever input variables (genes) the Solver is allowed to work with. These relationships are almost certainly very complicated and interdependent. We can make certain generalized statements about this system based on the limited knowledge we have though:
- All variables are treated equally. It is exceedingly unlikely this is a good thing, as these variables do not have the same units. Solar Energy might be specified in kilowatt-hours, while sunlight may be defined in lumens/m² while cost and view are unit-less properties.
- The function is linear. Again, this is probably a bad thing. If making a small window bigger results in a higher fitness per this fitness function, it is likely that making a big window bigger will also result in a higher fitness. What this basically means is that the solution will tend to end up in either one of the extreme cases.
To counter the first problem, we have to introduce factors into the fitness function that pull our variables into the same ball-park. Take for example V (views) and S (direct sunlight). V might be a value ranging from 0.0 (I can’t see anything out of the window at all) to 1.0 (I can see all the cool landmarks from this office). S might be a value ranging from 0.0 lux (no direct sunlight whatsoever) to ~100,000 lux (full sunlight). This enormous discrepancy means that even the slightest change in direct sunlight will completely overrule whatever quality of view we might have in any given setup. We can counter this by multiplying S by a fudge factor, pulling it into the same range as V:
F = D + V – (0.00001 * S) – E – C
Now a change in view quality has at least a chance of affecting the Solver. Of course all the other variables will also need to be scaled accordingly.
The second problem can be solved in a number of different ways. The most obvious one of which is penalty clauses. A penalty can either be assigned as an incremental (smooth) adjustment or as a sudden threshold (discontinuous). An incremental penalty for example could translate into English as: “We care about daylight, but losing a little bit of daylight when we only have a little bit is a far worse than losing a litte bit of daylight when we have loads to go around.” In other words, we don’t want daylight to be a linear component of the Fitness Function. We want to penalize loss of daylight more severely when there’s very little daylight left to begin with. The original behaviour of the daylight component in the Fitness Function looked like this:
Where Δd denotes a change in daylight and Δf the corresponding change in fitness. No matter where you are, a specific loss in daylight results in a specific loss of fitness. What we want is something that looks like this:
Where a specific loss of daylight results in a much heavier fitness penalty when there’s less daylight to go around. There are a lot of mathematical functions and operators that can give you a curve like this, but the Square Root and the Logarithm are well known and reliable ones. If we insert a square root function into our Fitness Function it will look like this:
F = Sqrt(D) + V – (0.00001 * S) – E – C
In the next post we’ll have a look at predicting the pressures Fitness Functions exert on the Solver at various stages.