John Perry : More on Core Wars Genetics
Tue, 21 Apr 92 09:06:19 -0700

Hi again,

As promised, I finally found a later revision of the paper in LaTeX format. Much of it was just a rewrite of the original term paper, but when it gets to the methods and conclusions there is some new information. The experiments described in the original paper are referred to as "phase 1" in this paper. For brevity's sake, I've left out the sections of this which were rewrites of sections of the other paper, and I've indicated as such in the paper.

I also found the code, but I have to tease apart the part of the code which is mine from the part of the code which is source code for Buckley's Core War Colliseum, since I don't have copyrights to that code. It should be out in a couple of weeks. Would one of you be so kind as to let me know where to place an FTP upload on soda? That way, when I get the code ready, I can put it and both papers in an archive or zip file and load it all up as one file to soda.

Anyway, here's the paper. As I said it's in LaTeX format. If you don't have LaTeX, send me a note back and I'll send you a copy in postscript format.

Later,
John

This paper documents an ongoing research effort in the field of artificial life. We demonstrate the evolution of predatory behavior in computer programs. The game Core War provides an environment for fitness testing, with evolution implemented by a genetic algorithm. We refer to the union of GAs and Core Wars as the Core Life environment. Results presented here show how random code can evolve into successful Core War programs in only a few generations.

Introduction
I left this introduction out, since it was basically a rewrite of the introduction to the other paper.

Description of the Core Life Model
I left this section out, since it was basically a rewrite of the ``Core Wars as a Binological Model'' section of the other paper.

Methods
We actually used two methods to fitness test warriors. In the initial experiments, all of the warriors of a population battled against common opponents. These opponents were warriors (written by humans) which had been entered in previous Core Wars tournaments. Both successful and unsuccessful warriors were chosen so that progress might be seen in the form of the genetic warriors defeating better and better opponents. Each warrior in a generation fought the same number of battles against the same opponents, but with random starting locations for each battle. In this way average warriors could still score highly if they happened to get advantageous starting positions\footnote{We felt that this normalization factor emulates the real world very well, since the very strong and the very smart aren't always the most successful organisms. For example, it is possible for a fairly mediocre individual to become U.S. Vice President.}. The top 10% of warriors in each generation were selected as champions, and thus allowed to reproduce.

This method of fitness testing turned out to be very successful, but perhaps unrealistic, since the opponents provided a sort of evolutionary goal. This is not to say that biological life forms never deal with man-made antagonists, but rather that the ideal competitors for evolving life forms are their sibling life forms. Thus we derived the other mode of fitness testing, whereby each warrior battled against its neighboring warriors. This eliminated the artificial goal, although there were still the inherent goals of survival and reproduction. Since success in the MARS was tied to reproduction, we predicted that we would still see evolutionary progress toward predatory behavior, but at a slower rate, since the motivating factors weren't quite as overt. For clarity, we will refer to these sets of experiments as phase~1 (human-generated opponents) and phase~2 (sibling opponents). Most of the methods were common between the two phases.

In our experiments, genetic manipulation was done at two levels: crossover at the instruction level, and mutation at the bit level. Mutation was done at the bit level to assure that the entire space of possible warrior programs was reachable. Crossover is a much more frequent operation, however, and Redcode assembly language isn't particularly robust (for example, it has a 4-bit opcode field but only 10 legal instructions). We felt that if crossover were done at the bit level, there would be too high a percentage of non-functional warriors in each generation. By performing crossover at the instruction level, it was still possible to produce non-functional warriors with this genetic operator, but not invalid instructions. In a way, instructions make more sense as genes than bits, since they carry more information.

In each generation, the entire population was replaced by reproduction with the genetic algorithm. A biased roulette wheel method was used to select parents from the champions of the previous generation, with each warrior given a chance of reproduction proportional to its total fitness score. In phase~1, each mating of two warriors produced two progeny in the following manner: First, instructions are copied one-by-one from one of the parents to one of the children. As each instruction is copied, there is a random chance (.05%) that mutation is performed. Mutation consists of flipping one randomly selected bit of the instruction's 40 bits. After each instruction is copied there is a random chance (25%) of a crossover.
After the first crossover occurs, instructions are copied from the other parent to the other child. Again, each instruction is tested for mutation, and after each instruction is copied there is a random chance of crossover. After the next crossover, instructions are copied from the second parent to the first child, one-by-one, with mutation and crossover tested for as before. Next, instructions are copied from the first parent to the second child, with mutation and crossover tested for. When crossover is chosen at this point, it goes back to the first parent being copied to the first child. This process continues until all of the instructions from both of the parents are copied to the children. Finally, the lengths of the two children are determined and a random starting address is selected for each. This method allowed multiple crossovers at random locations in the code; a sketch of this operator appears at the end of this section. Since the maximum warrior length we allowed was 64 instructions, we would expect an average of about $64 \times .25 = 16$ crossovers per mating in the longest warriors.

The reason we used such a high crossover rate is strictly computational. Due to space limitations, our population sizes were 1000 warriors per generation for phase~1, and 900 per generation for phase~2. Since the maximum warrior size for these experiments was 64 instructions, and Redcode machine instructions are 40 bits long, the maximum number of bits per warrior was $64 \times 40 = 2560$. This means the size of the space being searched is $2^{2560} \approx 4.33 \times 10^{770}$, which makes the population size seem pitifully small. We wanted something to stir up the evolutionary process, and we felt it was worth sacrificing genetic stability if it enabled faster evolutionary progress. Theoretically, we could reduce these rates and run the experiments for much longer and get similar results. For a more interesting experiment, we could effect a sort of ``evolutionary annealing'' by starting with high crossover and mutation rates and reducing them over time.

On a similar note, our method of crossover is very messy, since the positions of crossover and the lengths of the children can vary widely. From a probabilistic standpoint, however, this is no great drawback, since we would expect the warriors to stay in a moderate range of lengths and each to get some genes from both parents. In fact, it may have some benefit, since it tends to distribute the instructions that are present throughout their possible positions in the warriors. A caveat is that there is very little genetic stability, since even identical parents could produce very different children.

Populations for phase~2 experiments consisted of 900 warriors, arranged in a 30 by 30 array. The neighborhood for a warrior was defined as the 8 warriors immediately surrounding it in the array. For example, warrior $(i,j)$'s neighborhood looks like this:
$$\begin{array}{ccc}
(i-1,j-1) & (i-1,j) & (i-1,j+1) \\
(i,j-1)   & (i,j)   & (i,j+1)   \\
(i+1,j-1) & (i+1,j) & (i+1,j+1)
\end{array}$$
The edges of the space wrap around in a torus, so warrior (1,15) has warriors (30,14), (30,15), and (30,16) as three of its neighbors. Reproduction for phase~2 experiments uses the same basic algorithm, but only one child is produced from each pairing, and the code that would have been the other child is thrown away. The first parent to be copied from is selected randomly. Prospective parents for child (i,j) are chosen from the warriors in the neighborhood of (i,j) from the previous generation, rather than from the whole population.
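For concreteness, the following Python sketch illustrates the wrap-around neighborhood and local parent selection just described. It is an illustration rather than the original implementation: the function names are ours, and applying the biased roulette wheel within the neighborhood (rather than some other local selection rule) is an assumption.
\begin{verbatim}
import random

GRID = 30  # phase-2 warriors are arranged in a 30 by 30 torus

def neighbors(i, j, n=GRID):
    """Return the 8 warriors surrounding (i, j), using the paper's
    1-based coordinates and wrapping around the edges (torus)."""
    cells = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # a warrior is not its own neighbor
            cells.append((((i - 1 + di) % n) + 1,
                          ((j - 1 + dj) % n) + 1))
    return cells

def pick_parents(i, j, fitness):
    """Choose two parents for child (i, j) from its neighborhood in the
    previous generation.  `fitness` maps (row, col) to a fitness score;
    the biased roulette wheel gives fitter neighbors a proportionally
    larger chance of being drawn (assumed here to apply locally too)."""
    hood = neighbors(i, j)
    weights = [fitness[cell] for cell in hood]
    return random.choices(hood, weights=weights, k=2)

# The wrap-around example from the text: (1,15) sees row 30 as a neighbor.
assert {(30, 14), (30, 15), (30, 16)} <= set(neighbors(1, 15))
\end{verbatim}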
This means that there is no global homunculus, but rather all interaction is local.
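To tie the reproduction steps above together, here is a minimal Python sketch of the multiple-crossover mating operator, under stated assumptions: instructions are treated as opaque 40-bit integers, the .05% mutation and 25% crossover chances are as given earlier, a copy stream whose parent has run out simply yields to the next stream, and the final steps (enforcing the 64-instruction maximum and picking a random starting address in the MARS) are omitted.
\begin{verbatim}
import random

INSTR_BITS = 40         # Redcode instructions are 40 bits long
MUTATION_RATE = 0.0005  # .05% chance per copied instruction
CROSSOVER_RATE = 0.25   # 25% chance after each copied instruction

def mutate(instr):
    """Flip one randomly selected bit of a 40-bit instruction word."""
    return instr ^ (1 << random.randrange(INSTR_BITS))

def mate(parent_a, parent_b):
    """Produce two children from two warriors (lists of 40-bit ints)
    using the multiple-crossover scheme described in the text."""
    parents = [list(parent_a), list(parent_b)]
    pos = [0, 0]                     # read position within each parent
    children = [[], []]
    # Copy streams cycle: A->child1, B->child2, B->child1, A->child2, ...
    streams = [(0, 0), (1, 1), (1, 0), (0, 1)]
    s = 0
    while pos[0] < len(parents[0]) or pos[1] < len(parents[1]):
        p, c = streams[s % 4]
        if pos[p] >= len(parents[p]):
            s += 1                   # this parent is used up; move on
            continue
        instr = parents[p][pos[p]]
        pos[p] += 1
        if random.random() < MUTATION_RATE:
            instr = mutate(instr)
        children[c].append(instr)
        if random.random() < CROSSOVER_RATE:
            s += 1                   # crossover: switch copy streams
    return children[0], children[1]

# Example: mate two random 20-instruction warriors.
a = [random.getrandbits(INSTR_BITS) for _ in range(20)]
b = [random.getrandbits(INSTR_BITS) for _ in range(20)]
child1, child2 = mate(a, b)
\end{verbatim}
Note that every instruction from both parents ends up in exactly one child, which is why the children's combined length always equals the parents' combined length.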
Results

In each phase, we conducted several experiments of 10 to 15 generations. Results of a typical run of a phase~1 experiment are presented in table~1. These experiments indicate that it is possible to evolve predatory strategies, even in small populations and in a small number of generations. The data shows that the number of wins and ties by a whole population of warriors generally increases with each generation. The ratio of wins per battle fought is evidence that the predatory skills of each generation are better than the last. In some cases, the number of wins increased at the cost of the number of ties. This would seem to indicate that survival and predation are competing strategies in Core Life, which is realistic since any sort of attacking behavior has a chance of backfiring on the attacker, be it biological or binological. A related issue in biology is the fight or flight response.

Generation #   wins vs.    wins vs.   wins per         ties per         multi-battle
               QUARTER2    WANG1      battle fought    battle fought    winners
     0             2          3          0.2%             4.65%             1
     1            19         10          1.83%            5.1%              9
     2            59          5          4.86%            8.65%            41
     3           179         16          6.92%            6.02%            76
     4           258         19         11.2%             8.68%           170
     5           332         28         12.3%             5.2%            195
     6           401         28         17.6%             4.92%           292
     7           420         22          8.5%            12.3%            148
     8           480         23         20.96%            5.38%           363
     9           451         16         18.66%            9.72%           300
    10           549         24         21.1%             4.88%           385

Table 1: The results from one phase~1 experiment. WANG1 and QUARTER2 were two of the human-coded programs used as opponents for fitness testing. The other three columns refer to battles fought against opponents by a particular generation.

The fitness testing for phase~1 experiments was done using a commercial MARS which graphically displayed the battles during fitness testing. From observing these battles, it was apparent that a sort of speciation occurred. Warriors evolved a variety of strategies for survival, including bombing through memory with DAT (or some other) instructions, infinitely looping on a single instruction (this was more a survival strategy than a predatory strategy), and splitting to arbitrary locations in the arena in an attempt to branch into the opponent warrior's code (a form of mimicry). One complex strategy which evolved was a species which would bomb with imps\footnote{An imp consists of the single instruction ``MOV $0 $1''. This instruction, if executed by a process, will copy itself to the next location in memory. Then on the next clock cycle, the process will execute that copy and move itself ahead again, and so on.}, and then kill the imps when they reached its code. The fact that a cooperative process like this could evolve in so few generations is fascinating. Unfortunately, they were also extinct within a few generations, either because of the strong evolutionary force of the high crossover rate, or because their strategy, although elegant, wasn't effective.

Two second-generation warriors from a pilot version of this project were entered in the 1989 International Core Wars Tournament, and they came in second to last and third to last (the last place finisher had a bug in his program). Two of the best warriors (one sixth generation and one tenth generation) from an early run of phase~1 were entered in the 1990 ICWT and did considerably better, finishing 12th and 16th out of 26 finalists. The fact that these warriors were able to beat human-coded opponents is impressive; the additional fact that they did so in only a few generations is suggestive.
A properly controlled experiment which ran for hundreds, or even thousands, of generations should stand a good chance of winning the tournament. Granted, this is not nearly on a par with beating Karpov at chess, but still it would be an impressive notch on the genetic algorithm's belt.

The first runs of phase~2 experiments are currently being generated and analyzed. Since they weren't fitness tested against human-coded warriors as in phase~1, the fitness scores aren't as valid an indicator of their progress. As an initial analysis technique, we've taken some of the champion warriors from each generation and run them against some of the same warriors that were used in fitness testing in phase~1. Unfortunately, we didn't get this data ready in time for presentation in this initial report, but we can comment on observed results. The progress isn't nearly as steady as in the phase~1 experiments. Two factors could account for this. First, all phase~2 warriors which get any fitness score other than 0 have a chance to reproduce, unlike phase~1 where only the top 10% of the warriors were allowed to reproduce. Second, the data from phase~1 accounts for all of the warriors in each generation, whereas the scores from phase~2 only account for the few warriors which happened to get the highest fitness scores. Thus, in phase~1, there were more chances for mediocre warriors to win or tie through advantageous starting locations. It is important to note that despite these handicaps, some survival and predation strategies still evolved. We feel that this supports the validity of our conclusion that binological predation is an evolvable phenomenon. We hope that warriors of future generations of phase~2 experiments will reach (and perhaps surpass) the complexity level of phase~1 warriors.

Two pieces of data which we are collecting in the phase~2 experiments, but which we ignored in phase~1, are the fitness score for each warrior and the average battle length for each warrior. We feel that this data may be useful in analyzing the dynamics of evolutionary change over the whole population of warriors. For example, GAs have a tendency to find stable points, where much of the population becomes fairly homogeneous. At this point, we would expect a lot of ties, so the fitness scores should be lower and more evenly spread out. When one warrior gains some new ability, through either mutation or crossover, we would expect its score to be much higher for a few generations until it could spread this feature through reproduction. Next, we would expect this new feature to slowly ripple throughout the space until it was fully distributed. The interesting point to look for is the initial spike in the fitness landscape, as this indicates where the productive mutation came about.
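As a purely hypothetical illustration of this kind of analysis (the function names, data layout, and the three-standard-deviation threshold below are our assumptions, not part of the experiments), the per-generation fitness grids could be summarized and scanned for such spikes like this:
\begin{verbatim}
import statistics

def fitness_summary(grids):
    """Summarize per-generation fitness for a phase-2 style run.

    `grids` is a list indexed by generation; each entry is a 30x30
    nested list of fitness scores (an assumed layout, for illustration).
    Returns (mean, spread, best) per generation."""
    summary = []
    for grid in grids:
        scores = [s for row in grid for s in row]
        summary.append((statistics.mean(scores),
                        statistics.pstdev(scores),
                        max(scores)))
    return summary

def spike_generations(summary, factor=3.0):
    """Flag generations whose best score stands well above the mean --
    a crude indicator that a productive mutation or crossover has
    appeared but not yet spread.  The factor of 3 standard deviations
    is an arbitrary illustrative threshold."""
    spikes = []
    for gen, (mean, spread, best) in enumerate(summary):
        if spread > 0 and best > mean + factor * spread:
            spikes.append(gen)
    return spikes
\end{verbatim}
A generation flagged this way would be a natural place to start watching the new feature ripple outward through the 30 by 30 grid in subsequent generations.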
Conclusions/Discussion

There is a wide range of possibilities for further experiments on this subject. Currently, we are working on the suggested fitness landscape maps, and also planning several other experiments. The obvious ones that come to mind are variations in the basic parameters of the system, such as mutation and crossover rates, population size, number of generations, neighborhood structure, and the method of generating the initial random warriors. A more ambitious project would be to graph family trees of warriors, thus tracing their genetic progress. A slightly further deviation from this project would be to evolve programs for other tasks and in other instruction sets; in fact this has been done (more or less), but not on tasks that are distinctly binological in nature.
The Bibliography

Buckley, William R. The Rising Tide of Computational Vulnerability. IEEE Society on the Social Implications of Technology, Conference on Technics, Culture and Consequences, California State University, Los Angeles, October 20--21, 1989.

Buckley, William R. A Core War Model of Computational Evolution. International Society for the Systems Sciences, Proceedings of the 34th Annual Meeting, Portland, Oregon, July 8--13, 1990.

Collins, Robert J. Tracker. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990.

Dawkins, Richard. The Selfish Gene. Oxford University Press, Oxford, 1976.

Depew, David J. & Weber, Bruce H. (eds.). Evolution at a Crossroads. The MIT Press, Cambridge, 1985.

Dewdney, A. K. In the Game Called Core War Hostile Programs Engage in a Battle of Bits. Scientific American, May 1984.

Goldberg, David. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, New York, 1989.

Holland, John H. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, 1975.

Langton, Chris. Studying Artificial Life with Cellular Automata. Physica, pp. 120--149, 1987.

Perry, John R. Predatory Programming. The Core Wars Newsletter, Fall 1988.

Rasmussen, Steen. The Coreworld: Emergence and Evolution of Cooperative Structures in a Computational Chemistry. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990.

Stork, David G. Non-Optimality via Preadaptation in Simple Neural Systems. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990.

The Core War Standard of 1988. International Core Wars Society, 1988.

Definitions
[Binology] The study of computer-based complex systems which demonstrate lifelike characteristics.

[Computer Virus] A computer virus is a program which effects its distribution by attaching itself to the executable code of another program, called a vector program.