Chapter 3
Analysis

Simulation Framework

In order to design a simulation toolkit it is necessary develop a framework in which the different systems outlined in the previous chapter can be evaluated. The framework provides a general basis for relating specific system features to general evolutionary simulation concepts and is a starting point for the development of the class library requirements. In the following sections a number of criteria are described, and illustrated by applying them to the existing simulation systems.

Simulation model

The following attributes provide a framework for comparing different simulation models focusing on the abstractions rather than the implementation.

Abstraction level

Every simulation has some base level of objects which it stores and manipulates. The abstraction level refers to the real world objects that these primitive simulation objects represent. By analogy, in the field of neuroscience a useful framework is provided by the MacGregor-Lewis stratification [Cliff 90] depicted in Table 2.

level  description                   examples                       
A      molecular and chemical        chemical synaptic              
       processes                     transmission, gating           
                                     processes                      
B      membrane conductance          synaptic conductances,         
       modulations                   active dendritic               
                                     conductances, action           
                                     potential conductances         
C      electrical signals            field potentials, spike        
                                     trains, generator potentials   
D1     theoretical inferred          cell assemblies, statistical   
       constructs (biological)       configurations                 
D2     theoretical inferred          superego, id, drives,          
       constructs (psychological)    cognitive maps                 
E      behaviour, experience,        direct observations            
       consciousness

Table 2. MacGregor-Lewis Stratification

The same idea can be extended to classifying evolutionary simulation systems. Though there can be many levels of abstraction in between the ones shown in Table 3 the seven listed are adequate for comparing existing simulation systems.

level    description    examples                                   
1        atomic         carbon and hydrogen, cellular automata     
                        cell                                       
2        molecular      proteins, masses and springs               
3        cellular       neurons, abstract cells                    
4        organ          brain, leg, feeler, motor, sensor          
5        organism       P. Computatrix, bug, pet                   
6        population     predator population vs. prey population    
                        (foxes and rabbits)                        
7        ecology        food chains, Gaia

Table 3. Evolutionary Simulation Abstraction Levels

Abstraction depth

Abstraction depth refers to how many abstraction levels are spanned in a simulation. A program that models a population by simulating a collection of organisms has an abstraction depth of one. If, however, the organisms are collections of organs (say a vehicle with motors, sensors, and an artificial neural network to drive the behaviour) then the simulation would span two levels.

Abstraction depth is key to the concept of emergence. Emergence is a process in which a collection of interacting units acquires qualitatively new properties that cannot be reduced to a simple superposition of individual contributions [Taylor 92]. A simplistic example is the property of life arising from the interaction of billions of non-living molecules in an organism.

Emergence appears to be a dominant theme in the field of artificial life [Lewin 92]: emergence of structure from the interaction of many developmental modules (morphogenesis), emergence of behavior from the interaction of many neuronal elements, emergence of control within an ecosystem, and emergence of self-organizing dynamics in a selective system.

Abstraction depth is correlated with the accuracy of a simulation -- how close it is to modeling behavior in the real world. The deeper the simulation is, the less brittle it is. As a simple example, consider a zero depth simulation of a beaker of water. The program may consider many variables: the volume of water, temperature, salt content, etc. It may accurately predict the resulting temperature given a certain amount of heat is added but it is unlikely that it would model the effects of evaporation unless that was explicitly built into the model.

Now imagine the same simulation taken down a level so that instead of modeling the entire beaker as a single object, the water is simulated by a collection of small volumes of water. We would get the same results for the overall average temperature of the water after a certain amount of heat is applied to the system, but in addition the simulation would predict the formation of convection cells within the beaker, an emergent phenomenon [Nicolis 89]. If the simulation was taken down another level, to molecular physics, it may be possible to derive the effects of evaporation rather than building them into the model.

The cost of the increased depth is, of course, an exponentially higher requirement for computational time and space. It would be ideal to base the simulation at the sub-atomic level, then given a sufficiently accurate sub-atomic model to work from, everything else from molecules to cells to organisms on up should come for free. In practice this is wildly unfeasible. Modeling of the tertiary folding of a single protein molecule is difficult enough to occupy a powerful (10 MIPS) workstation by contemporary standards.

Metaphysics

Prusinkiewicz describes two types of mathematical structure models:

Models may be structure-oriented, focusing on the components of the developing structure, or space-oriented, capturing the whole space that embeds this structure. A model in the first category typically describes where each component in the structure is located. A model in the second category describes what is located at (or what is the state of) each point of space.

[Prusinkiewicz 93]

Space-oriented models include reaction-diffusion models and cellular automata while structure-oriented models include L-systems and map L-systems.

This analysis can be extended to simulations. A structure-oriented simulation typically describes where each primitive component of the system (molecule, cell, organism) is located while a space-oriented system describes what is located at each point of space. Hybrid systems are also possible. Cellular automata or a reaction-diffusion model may be used to track variables in the environment which are too low-level to be made into first class objects such as food, water, and pheromone concentration.

In addition, the simulated structures and the space that embeds them can be continuous or discrete. The state characterizing each point in the space may be taken from a continuous or discrete domain and can have any number of dimensions. The simulation may operate in continuous or discrete time.

Physics

Different simulations can be compared in terms of physical and biological plausibility. To what extent are laws of physics taken into account? Can two objects occupy the same space or is there some sort of collision detection and resolution? Do objects have mass? Are momentum and energy conserved? In a discrete space-oriented simulation such as a cellular automaton, questions relating to physics may have no relevance, but by the same token the simulation may be so far abstracted from the real world as to have diminished relevance. Five classes of increasing complexity are described in Table 4.

class   description                           
  A     abstract environment, no physics      
        per se                                
  B     physical space with collision         
        detection                             
  C     primitive physics - energy balance    
  D     advanced model - friction, etc.       
  E     full scale physically-based model

Table 4. Physics Classes

Biology

Similar questions can be asked from the domain of biology: How are phenotypes generated from genotypes (if at all)? Do organisms go through some kind of developmental process or are they instantly created? Do the genotypes have a fixed or variable length? Are generations in the population replaced all at once by an external mechanism or does is there local competition and mating? Five biological classes are shown in Table 5, again increasing in sophistication.

class   description                                              
  A     no biology per se                                        
  B     external genetic algorithm - global competition and      
        mating                                                   
  C     primitive biology - local competition and mating         
  D     advanced model - morphogenesis                           
  E     full scale physically-based developmental model

Table 5. Biology Classes

Psychology

Autonomous agents need some kind of program to drive their behavior contingent on their current state and input from the environment. A number of different control mechanisms are currently being explored by researchers: condition-action rules, combinational logic, spreading activation networks, finite state machines, Turing machines, feedforward and recurrent discrete- and continuous-time neural networks [Beer 92].

Simulation implementation

The following attributes provide a framework for comparing different simulation implementations focusing more on programmer and user issues rather than the abstractions.

Extensibility

Extensibility refers to how easy is it to extend the functionality of the simulation. Can the system be used to test theories beyond the programmer's original designs? Is it possible to add new types of organisms? Is it possible to change the environment? Can the evolutionary mechanisms be modified and enhanced? How much can be changed without having to modify the source code and recompile? If source code is provided with the system, how amenable is it to extension? Five classes corresponding to increasing levels of extensibility are outlined in Table 6.

class   description                                                 
  A     No modification possible.                                   
  B     Run-time parameters can be changed.                         
  C     Design time parameters can be changed. (source code         
        provided)                                                   
  D     Source code provided and designed for extension.            
  E     A complete evolutionary simulation toolkit designed for     
        extension.

Table 6. Extensibility Classes

Portability

Even with recent advances in open systems, portability is still an issue. If a system is written in a theoretically portable language (e.g. C), it may still make use of libraries dependent on certain systems, especially graphical user interface libraries (e.g. X11, MS Windows). For each system, the language it was written in and the operating system(s) it supports is noted. Special mention is made when a particular system makes use of non-portable hardware or software.

Data gathering

For research it is mandatory that the results of experiments be recorded for interpretation and statistical analysis. Therefore, when evaluating simulation systems it is important to note what facilities are provided for measuring the process. Which system variables can be recorded? Does the user have any control over the format? The systems are rated from 0 (no data gathering supported) to 5 (extensive).

Visualization

Another important aspect of simulation-based research is visualization. Visualization of the simulation results facilitates their interpretation [Prusinkiewicz 93] and is used as a method of evaluating models. Often there are no formal measures for emergent phenomena such as situational behaviors and population dynamics; therefore we rely on visual inspection to compare the simulation results with reality. The systems are rated from 0 (no visualization support) to 5 (extensive).

Evaluation of systems

In the following sections the simulation systems introduced in Chapter 2 are evaluated in the context of the framework developed. Summaries of the results of the evaluations appear in Table 7 and Table 8 at the end of this section.

Tierra

Tierra is unlike any of the other systems in that the environment is a discrete linear space. It is a space-oriented simulation: each cell in the "soup" contains an instruction; some contingent blocks of cells happen to make up a Tierran creature.

In terms of levels of abstraction, Tierra is implemented at the lowest level and spans the greatest number of levels of all systems evaluated. Though the instructions that make up a Tierran creature have no direct real-world counterpart, they are closest to the cellular level (level 3). After a few thousand generations, a rich ecology emerges with several species competing for resources in varied niches (level 7).

Since the environment is completely abstract, there is no need for physical laws beyond the time and space implied by the computer memory metaphor. The biology of the system is also primitive, not unlike prokaryotes at the dawn of life on Earth. There is evolution through reproduction and mutation, but no mating nor morphogenesis. It is important to note that Tierra does not preclude higher level biological mechanisms, they just are not built into the system explicitly.

Petworld

The Petworld environment is a 2-D structure-oriented discrete space. The pets represent the lowest abstraction level of the system, that of the organism (level 5). The simulation has a single level of depth up to small-sized populations (around ten individuals).

The physics of Petworld may be classified as low-end level C because an energy balance is implemented with the foraging rules: whenever a pet feeds on a tree, the tree loses "meal points" and vanishes when the count reaches zero. Pets cannot reproduce, so there is no evolution (class A biology).

Petworld 2.0 runs on the Apple Macintosh and was written in Allegro Common Lisp.

BrainWorks

The BrainWorks world is a 2-D structure-oriented continuous space. In the design of BrainWorks, it was felt that the geometry of the world made a good trade-off between complexity and simplicity. The continuous aspect of the environment allows for a realistic simulation of orientation which is fundamental concept in animal movement. Yet constraining the world to two dimensions kept the computational requirements low enough for a real-time, interactive simulation.

The turtles that inhabit the BrainWorks world are built from components: motors, sensors and neurons. This places the simulation within abstraction level 4. Though the initial system was designed to allow building of animals by hand, it evolved to support populations of turtles (abstraction depth 1) that reproduced with mutation (biology class B).

Bugland

The Bugland simulation has a space-oriented discrete environment, either 2- or 3-dimensional. Each cell in the space takes is state from a discrete set: it contains a bug, a bug trail of a certain colour, or is empty. The level of abstraction ranges from bug (level 5 - organism) to a collection of species of bugs (level 7 - ecology) giving Bugland an abstraction depth of 2.

The physics and biology of the simulation are abstract computational models. Bugland has space and time (both discrete) but no mass or energy or physical laws (class A physics). The organisms are Turing machines and the environment is their tape. Evolution is implemented by applying an external genetic algorithm with global competition and mating (class B biology).

P. Computatrix

The world of P. Computatrix is a structure-oriented continuous space. The choice of a physical model for the body of the simulated insect was a tradeoff between sufficient complexity to generate interesting behaviors and sufficient simplicity to avoid overwhelming the model with overly difficult sensori-motor control problems. Likewise, the choice of a physical model for the environment in which the insect is situated was a tradeoff between complexity and computational expense.

In the simplified physics, the velocity of an object is proportional to the force applied and inversely proportional to the mass of the object (F mv). Rotational velocity is treated similarly with force proportional to torque and inversely proportional to rotational inertia (F Iw). Time is divided into 20 ms segments. At each interval the sum of forces is calculated for each object in the system and their position and velocity are updated accordingly.

Figure 17. P. Computatrix The organism was designed from parts: legs, body and antennae (level 4) as shown in Figure 17. Each of the six legs may be up or down. When up, forces applied to the leg cause it to swing. When down, forces applied to the leg move the body. The behavior of the insect is driven by a continuous-time recurrent neural network. The antennae and mouth contain tactile and chemical sensors which provide input to the neural network which drives the behavior of the insect. The output of the net is translated into forces which move the legs and the mouth.

The environment may contain walls and any number of bricks and patches. Bricks are immovable rectangular objects with which the insect can interact physically. When an insect encounters a brick or wall, it bounces off and its tactile sensors are triggered. Patches are usually used to represent food which gives off a strength of odor proportional to the number of food units contained in the patch and the area of the patch. The odor diffuses through the environment and drops off with an inverse square law from the centre of the patch.

P. Computatrix has a simple metabolism. It starts with a finite amount of energy which is spent at a fixed rate. When the insect's mouth intersects with a food patch, a fixed number of food units is transferred from the patch to the insect's energy store. Repeatedly opening and closing its mouth will transfer additional units. If its energy store goes to zero, it dies.

The simulation was extended with a genetic algorithm in [Beer 92]. The GA was applied to an explicit representation of the insect's neural network; the threshold, time constant and connection rate of each unit was encoded in the genotype. The insects in the population were run in isolation so competition and mating were global (class B biology). Successful evolution of complex behaviors were demonstrated including chemotaxis (detecting and following a chemical gradient) and hexapod locomotion.

The implementation of the simulation (not including the genetic algorithm) contained about 5000 lines of Lisp (Flavors) and runs on the Texas Instruments Explorer II LX Lisp Machine. The genetic algorithm was implemented using the public domain GAucsd [Schraudolph 91] .

O System

The O System world is a discrete 2-dimensional structure-oriented environment populated by a single O and a finite number of food sources. Abstraction is at the level of the organism (level 5) and no higher levels are simulated (depth of 0).

An O's behavior is driven by a small (13 units) feedforward neural network. Sensory input is encoded by two input units representing the angle and distance to the nearest food source. The two output units are interpreted as one of four possible actions: turn left, turn right, move forward or stay still. These actions are automatically executed, no physics is simulated (class A). Since each O is run in isolation, mating and competition among the population is global (class B biology).

The papers describing the O System contain no information about the implementation.

RAM

The RAM environment is a 2-dimensional hybrid discrete space. The simulation is a hybrid because the environment subsystem is space-oriented while the species subsystem is structure-oriented. The base level of abstraction is provided by the species sub-system (a Lisp program), typically implemented at the level of an organism (level 5) but could range to the level of a population as in the simulation of mosquito populations mentioned in [Taylor 89]. Simulation depth ranges from populations to ecologies (2 levels).

It is difficult to classify the level of physics and biology embodied by the RAM system because it appears flexible enough to implement an arbitrary degree of detail in the processes. However from the descriptions of the applications of RAM, it is apparent that the level of physics and biology tended toward the very abstract (class A). The models simulated population dynamics (mating group formation, predator-prey interactions, and pest control). No mention was made of simulating physical laws or genetics.

RAM was implemented in T (a flavor of Lisp) and runs on an Apollo workstation (or presumably any platform that runs the T environment). At the time of the writing of the paper it was being ported to Common Lisp on the Apple Macintosh.

Genesys

The Genesys world is a structure-oriented 2-dimensional discrete grid. An ant is simulated in isolation as it tries to follow the increasingly sparse trail of food (abstraction level 5, no depth). No physical laws are simulated (class A physics) and the genetic algorithm uses global mating and competition (class B biology).

The main purpose of the system is compare two types of control mechanisms that drive the behavior of the ant in its trail-following task: a finite state machine (FSM) and a recurrent artificial neural network (ANN). Both types are explicitly encoded in the genotype of the ant. The FSM is represented as a binary-coded state transition table while the ANN is encoded as the weight, threshold and initial activation for each of 11 units.

Genesys/Tracker runs on a CM2 (16K-processor Connection Machine). A computer of this power is necessary to run an experiment with a population size of 64K over a hundred generations in a reasonable time.

AntFarm

AntFarm is an extension of the Genesys system described above. Like its predecessor, the environment is a 2-dimensional discrete space. In addition to ants and food, the environment can contain odors to represent pheromones. The fact that the odors diffuse through the environment implies a space-oriented model so AntFarm is a hybrid simulation.

The physics, like Genesys, is abstract but attempts were made to make the biological mechanisms more realistic in order to simulate the natural evolution of a biologically motivated task, specifically central-place foraging. Towards this end a more realistic genetic algorithm was implemented with local competition and mating and the organism was made more complex with a wider range of sensory input and action repertoire. The ant's behavior is controlled by a neural network with 36 input units, 21 hidden units, and 7 output units.

AntFarm is implemented in C++ and runs on the 64K-processor Connection Machine.

Polyworld

Polyworld is a structure-oriented 2-dimensional continuous environment. Even though it is rendered as a 3-dimensional environment for purposes of visualization, it appears as 2-dimensional to its occupants. The base abstraction is implemented at the level of the organism (level 5) with simulations ranging up to the level of interacting species giving Polyworld an abstraction depth of 2.

Most of the physics in Polyworld beyond collision detection is implemented in the equations governing an organism's metabolism which is in turn determined by its genotype. The amount of energy an organism can store is proportional to its size. Size also affects the rate the organism expends energy, as does its strength, speed, and the particular behavior being expressed at any given time. Organisms replenish their energy store by feeding on the food which grows in Polyworld or on other organisms.

An organism's behavior is controlled by a neural network which is encoded implicitly in its genotype. Rather than fixing the number of units in the network as in previous systems, the genotype specifies the number of neuronal groups and number of neurons in each group. In addition, the genotype encodes the initial bias of neurons, learning rate, connection density between groups and topological distortion between groups. The latter refers to the degree of disorder in the mapping of synaptic connections from one group to the next. A low valued topological distortion implies an ordered, contiguous mapping whereas a high distortion implies a random mapping from one layer to the next. The motivation behind this indirect representation was to reduce the length of the genotype and also to avoid biasing for any particular neural organization.

The implementation of Polyworld contains about 15000 lines of C++ code and runs on Silicon Graphics workstations. The simulation is not portable because it takes advantage of hardware graphics rendering to implement the organisms' visual mechanism.

     System       Abstr.   Abstr.   Meta-physics                   Physics   Biology   
                  level    depth                                                       
Tierra               3     4        1-D     discrete    space       A         A         
Pet World            5     1        2-D     discrete    structure   C         A         
BrainWorks           4     1        2-D     continuous  structure   C         B         
Bugland              5     2        2,3-D   discrete    structure   A         B         
P. Computatrix       4     1        2-D     continuous  structure   D         B         
O System             5     0        2-D     discrete    structure   A         B         
RAM                  5     2        2-D     discrete    hybrid      A         A         
Genesys              5     0        2-D     discrete    structure   A         B         
AntFarm              5     1        2-D     discrete    hybrid      A         C         
Polyworld            5     2        2-D     continuous  structure   C         D

Table 7. Simulation Model Attributes

Some relevant observations can be made by reviewing Table 7. The abstraction level of the surveyed simulations tended toward the high side, from level 3 (cellular) to level 5 (organism). At the same time abstraction depth tended to be quite shallow; Tierra was the only system surpassing a depth of 3. This seems to indicate the field is still immature. As the technology progresses we can expect to see simulations with lower levels of abstraction and higher depths. Most systems modeled the environment as a discrete, 2-D space. The more realistic simulations (P. Computatrix, Polyworld) used a continuous space but only Bugland ventured into the third dimension. Continuous domains may be necessary to evolve system with real-world applications such a robot controllers, however the surveyed systems demonstrate that discrete domains are sufficient for evolving interesting and complex behaviors. The physical and biological plausibility classes of systems also tended towards the low (unsophisticated) end, again indicating room for improvement. Tookits aimed at the domain of evolutionary simulation will have to address these deficiencies.

Table 8 shows no clear trends in the portability of the systems; hardware and operating system platforms are diverse, and the implementation language (where known) is split evenly between Lisp and C++. That the extensibility of the systems tended toward the low end (most were class B, flexible run-time parameters) merely reflects the fact the systems were designed as simulations rather that toolkits. In the cases where the data-gathering and visualization capabilities were known, they were impressive. These indicate important areas for toolkit requirements.

Design requirements

The toolkit must support both space-oriented and structure-oriented environments because neither tends to be a generalization of the other. The same is true for discrete vs. continuous environments.

Though some simulations deal with a single organism in isolation, this is a special case of the more general feature of simulating populations of agents. The toolkit must be able to simulate several populations of organisms of different types (species).

     System       Portability          Extensibility   Data          Visualization    
                                                       Gathering                      
Tierra             DOS/UNIX    C       C               5             3                
Petworld              MAC      Lisp    B               ?             4                
BrainWorks            MAC      ?       B               ?             4                
Bugland               PC       ?       B               ?             4                
P. Computatrix    TI Explorer  Lisp    B               ?             4                
O System               ?       ?       A               ?             ?                
RAM                 Apollo     Lisp    D               ?             4                
Genesys              CM21      C++     A               5             ?                
AntFarm               CM2      C++     B               5             ?                
Polyworld          SGI UNIX    C++     C               ?             5

Table 8. Simulation Implementation Attributes

Many systems used artificial neural networks to control the behavior of the simulated agents. This seems to be a natural choice because it simulates (to some extent) real biological control mechanisms and they are amenable to evolutionary processes. However the toolkit should allow for different types of control mechanisms so as not to preclude studies where different mechanisms are compared.

Another common feature is some sort of simulated evolution. Both global and local competition and mating should be supported by the toolkit. There should also be sufficient flexibility on the details of the genetic processes with respect to genotype length, genetic operators, and morphogenesis to allow for a broad range of evolutionary simulations.

All systems examined had some kind of visualization. A schematic overview of the environment was common. The toolkit should be able to display an animated map showing a graphical representation of each object in the world. A 3-dimensional rendering (with shading and texture mapping) is probably not necessary for most applications. It should also be easy to display the important system variables in various formats (e.g. line graphs, histograms) and save the data in a portable file format for further analysis with 3rd party programs such as statistics packages and spreadsheets.

From an analysis of the surveyed simulation systems, it is possible to abstract the common features. The archetypal artificial life simulation can be characterized by the following set of features:

an environment defining the virtual reality where the simulation occurs,
a population of agents interacting with each other and their environment,
a control mechanism to drive the behavior of each agent,
a mechanism which allows the agents to evolve through selection,
a visual display of the current state of the simulation including a spatial overview and ongoing graphical displays of salient variables.

A toolkit for artificial life simulations must provide all the aforementioned features, and do so in a manner which minimizes the effort of applications developers. Towards this end, the toolkit should also have the following set of features: the components should be

reusable,
understandable,
and extensible.

The reusability criterion is essentially the definition of a toolkit: a set of components that can be applied in the development of new applications. If the features of the toolkit are too complicated or incoherent to be understood by the developer, they will not be of much use. Finally, in a field as young as artificial life, no-one can possibly predict what a complete set of tools will look like, therefore it is imperative that the components be extensible so that they can be adapted to simulations unimagined by the original designer. A class library which implements these requirements will be designed in the next chapter.