John Perry : More on Core Wars Genetics
Tue, 21 Apr 92 09:06:19 -0700

Hi again,

As promised, I finally found a later revision of the paper in LaTeX format. Much of it was just a rewrite of the original term paper, but when it gets to the methods and conclusions there is some new information. The experiments described in the original paper are referred to as "phase 1" in this paper. For brevity's sake, I've left out the sections of this which were rewrites of sections of the other paper, and I've indicated as such in the paper.

I also found the code, but I have to tease apart the part of the code which is mine from the part of the code which is source code for Buckley's Core War Colliseum, since I don't have copyrights to that code. It should be out in a couple of weeks. Would one of you be so kind as to let me know where to place an FTP upload on soda? That way, when I get the code ready, I can put it and both papers in an archive or zip file and load it all up as one file to soda.

Anyway, here's the paper. As I said it's in LaTeX format. If you don't have LaTeX, send me a note back and I'll send you a copy in postscript format.

Later,
John

This paper documents an ongoing research effort in the field of artificial life. We demonstrate the evolution of predatory behavior in computer programs. The game Core War provides an environment for fitness testing, with evolution implemented by a genetic algorithm. We refer to the union of GAs and Core Wars as the Core Life environment. Results presented here show how random code can evolve into successful Core War programs in only a few generations.

Introduction
I left this introduction out, since it was basically a rewrite of the introduction to the other paper.

Description of the Core Life Model
I left this section out, since it was basically a rewrite of the ``Core Wars as a Binological Model'' section of the other paper.

Methods
We actually used two methods to fitness test warriors. In the initial experiments, all of the warriors of a population battled against common opponents. These opponents were warriors (written by humans) which had been entered in previous Core Wars tournaments. Both successful and unsuccessful warriors were chosen so that progress might be seen in the form of the genetic warriors defeating better and better opponents. Each warrior in a generation fought the same number of battles against the same opponents, but with random starting locations for each battle. In this way average warriors could still score highly if they happened to get advantageous starting positions\footnote{We felt that this normalization factor emulates the real world very well, since the very strong and the very smart aren't always the most successful organisms. For example, it is possible for a fairly mediocre individual to become U.S. Vice President.}. The top 10% of warriors in each generation were selected as champions, and thus allowed to reproduce.

This method of fitness testing turned out to be very successful, but perhaps unrealistic, since the opponents provided a sort of evolutionary goal. This is not to say that biological life forms never deal with man-made antagonists, but rather that the ideal competitors for evolving life forms are their sibling life forms. Thus we derived the other mode of fitness testing, whereby each warrior battled against its neighboring warriors. This eliminated the artificial goal, although there were still the inherent goals of survival and reproduction. Since success in the MARS was tied to reproduction, we predicted that we would still see evolutionary progress toward predatory behavior, but at a slower rate, since the motivating factors weren't quite as overt. For clarity, we will refer to these sets of experiments as phase~1 (human-generated opponents) and phase~2 (sibling opponents). Most of the methods were common between the two phases.

In our experiments, genetic manipulation was done at two levels: crossover at the instruction level, and mutation at the bit level. Mutation was done at the bit level to assure that the entire space of possible warrior programs was reachable. Crossover is a much more frequent operation, however, and Redcode assembly language isn't particularly robust (for example, it has a 4-bit opcode field but only 10 legal instructions). We felt that if crossover were done at the bit level, there would be too high a percentage of non-functional warriors in each generation. By performing crossover at the instruction level, it was still possible to produce non-functional warriors with this genetic operator, but not invalid instructions. In a way, instructions make more sense as genes than bits, since they carry more information.

In each generation, the entire population was replaced by reproduction with the genetic algorithm. A biased roulette wheel method was used to select parents from the champions of the previous generation, with each warrior given a chance of reproduction proportional to its total fitness score. In phase~1, each mating of two warriors produced two progeny in the following manner: First, instructions are copied one-by-one from one of the parents to one of the children. As each instruction is copied, there is a random chance (.05%) that mutation is performed. Mutation consists of flipping one randomly selected bit of the instruction's 40 bits. After each instruction is copied there is a random chance (25%) of a crossover.
After the first crossover occurs, instructions are copied from the other parent to the other child. Again, each instruction is tested for mutation, and after each instruction is copied there is a random chance of crossover. After the next crossover, instructions are copied from the second parent to the first child, one-by-one, with mutation and crossover tested for as before. Next, instructions are copied from the first parent to the second child, with mutation and crossover tested for. When crossover is chosen at this point, it goes back to the first parent being copied to the first child. This process continues until all of the instructions from both of the parents are copied to the children. Finally, the lengths of the two children are determined and a random starting address is selected for each. This method allowed multiple crossovers at random locations in the code; a sketch of this operator appears at the end of this section. Since the maximum warrior length we allowed was 64 instructions, we would expect an average of about $64 \times .25 = 16$ crossovers per mating in the longest warriors.

The reason we used such a high crossover rate is strictly computational. Due to space limitations, our population sizes were 1000 warriors per generation for phase~1, and 900 per generation for phase~2. Since the maximum warrior size for these experiments was 64 instructions, and Redcode machine instructions are 40 bits long, the maximum number of bits per warrior was $64 \times 40 = 2560$. This means the size of the space being searched is $2^{2560} \approx 4.33 \times 10^{770}$, which makes the population size seem pitifully small. We wanted something to stir up the evolutionary process, and we felt it was worth sacrificing genetic stability if it enabled faster evolutionary progress. Theoretically, we could reduce these rates and run the experiments for much longer and get similar results. For a more interesting experiment, we could effect a sort of ``evolutionary annealing'' by starting with high crossover and mutation rates and reducing them over time.

On a similar note, our method of crossover is very messy, since the positions of crossover and the lengths of the children can vary widely. From a probabilistic standpoint, however, this is no great drawback, since we would expect the warriors to stay in a moderate range of lengths and each to get some genes from both parents. In fact, it may have some benefit, since it tends to distribute the instructions that are present throughout their possible positions in the warriors. A caveat is that there is very little genetic stability, since even identical parents could produce very different children.

Populations for phase~2 experiments consisted of 900 warriors, arranged in a 30 by 30 array. The neighborhood for a warrior was defined as the 8 warriors immediately surrounding it in the array. For example, warrior $(i,j)$'s neighborhood looks like this:
$$\begin{array}{ccc}
(i-1,j-1) & (i-1,j) & (i-1,j+1) \\
(i,j-1)   & (i,j)   & (i,j+1)   \\
(i+1,j-1) & (i+1,j) & (i+1,j+1)
\end{array}$$
The edges of the space wrap around in a torus, so warrior (1,15) has warriors (30,14), (30,15), and (30,16) as three of its neighbors. Reproduction for phase~2 experiments uses the same basic algorithm, but only one child is produced from each pairing, and the code that would have been the other child is thrown away. The first parent to be copied from is selected randomly. Prospective parents for child (i,j) are chosen from the warriors in the neighborhood of (i,j) from the previous generation, rather than from the whole population.
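For concreteness, the following Python sketch illustrates the wrap-around neighborhood and local parent selection just described. It is an illustration rather than the original implementation: the function names are ours, and applying the biased roulette wheel within the neighborhood (rather than some other local selection rule) is an assumption.
\begin{verbatim}
import random

GRID = 30  # phase-2 warriors are arranged in a 30 by 30 torus

def neighbors(i, j, n=GRID):
    """Return the 8 warriors surrounding (i, j), using the paper's
    1-based coordinates and wrapping around the edges (torus)."""
    cells = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # a warrior is not its own neighbor
            cells.append((((i - 1 + di) % n) + 1,
                          ((j - 1 + dj) % n) + 1))
    return cells

def pick_parents(i, j, fitness):
    """Choose two parents for child (i, j) from its neighborhood in the
    previous generation.  `fitness` maps (row, col) to a fitness score;
    the biased roulette wheel gives fitter neighbors a proportionally
    larger chance of being drawn (assumed here to apply locally too)."""
    hood = neighbors(i, j)
    weights = [fitness[cell] for cell in hood]
    return random.choices(hood, weights=weights, k=2)

# The wrap-around example from the text: (1,15) sees row 30 as a neighbor.
assert {(30, 14), (30, 15), (30, 16)} <= set(neighbors(1, 15))
\end{verbatim}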
This means that there is no global homunculus, but rather all interaction is local.
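To tie the reproduction steps above together, here is a minimal Python sketch of the multiple-crossover mating operator, under stated assumptions: instructions are treated as opaque 40-bit integers, the .05% mutation and 25% crossover chances are as given earlier, a copy stream whose parent has run out simply yields to the next stream, and the final steps (enforcing the 64-instruction maximum and picking a random starting address in the MARS) are omitted.
\begin{verbatim}
import random

INSTR_BITS = 40         # Redcode instructions are 40 bits long
MUTATION_RATE = 0.0005  # .05% chance per copied instruction
CROSSOVER_RATE = 0.25   # 25% chance after each copied instruction

def mutate(instr):
    """Flip one randomly selected bit of a 40-bit instruction word."""
    return instr ^ (1 << random.randrange(INSTR_BITS))

def mate(parent_a, parent_b):
    """Produce two children from two warriors (lists of 40-bit ints)
    using the multiple-crossover scheme described in the text."""
    parents = [list(parent_a), list(parent_b)]
    pos = [0, 0]                     # read position within each parent
    children = [[], []]
    # Copy streams cycle: A->child1, B->child2, B->child1, A->child2, ...
    streams = [(0, 0), (1, 1), (1, 0), (0, 1)]
    s = 0
    while pos[0] < len(parents[0]) or pos[1] < len(parents[1]):
        p, c = streams[s % 4]
        if pos[p] >= len(parents[p]):
            s += 1                   # this parent is used up; move on
            continue
        instr = parents[p][pos[p]]
        pos[p] += 1
        if random.random() < MUTATION_RATE:
            instr = mutate(instr)
        children[c].append(instr)
        if random.random() < CROSSOVER_RATE:
            s += 1                   # crossover: switch copy streams
    return children[0], children[1]

# Example: mate two random 20-instruction warriors.
a = [random.getrandbits(INSTR_BITS) for _ in range(20)]
b = [random.getrandbits(INSTR_BITS) for _ in range(20)]
child1, child2 = mate(a, b)
\end{verbatim}
Note that every instruction from both parents ends up in exactly one child, which is why the children's combined length always equals the parents' combined length.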
Results

In each phase, we conducted several experiments of 10 to 15 generations. Results of a typical run of a phase~1 experiment are presented in table~1. These experiments indicate that it is possible to evolve predatory strategies, even in small populations and in a small number of generations. The data shows that the number of wins and ties by a whole population of warriors generally increases with each generation. The ratio of wins per battle fought is evidence that the predatory skills of each generation are better than the last. In some cases, the number of wins increased at the cost of the number of ties. This would seem to indicate that survival and predation are competing strategies in Core Life, which is realistic since any sort of attacking behavior has a chance of backfiring on the attacker, be it biological or binological. A related issue in biology is the fight or flight response.

Generation #   wins vs.    wins vs.   wins per         ties per         multi-battle
               QUARTER2    WANG1      battle fought    battle fought    winners
     0             2          3          0.2%             4.65%             1
     1            19         10          1.83%            5.1%              9
     2            59          5          4.86%            8.65%            41
     3           179         16          6.92%            6.02%            76
     4           258         19         11.2%             8.68%           170
     5           332         28         12.3%             5.2%            195
     6           401         28         17.6%             4.92%           292
     7           420         22          8.5%            12.3%            148
     8           480         23         20.96%            5.38%           363
     9           451         16         18.66%            9.72%           300
    10           549         24         21.1%             4.88%           385

Table 1: The results from one phase~1 experiment. WANG1 and QUARTER2 were two of the human-coded programs used as opponents for fitness testing. The other three columns refer to battles fought against opponents by a particular generation.

The fitness testing for phase~1 experiments was done using a commercial MARS which graphically displayed the battles during fitness testing. From observing these battles, it was apparent that a sort of speciation occurred. Warriors evolved a variety of strategies for survival, including bombing through memory with DAT (or some other) instructions, infinitely looping on a single instruction (this was more a survival strategy than a predatory strategy), and splitting to arbitrary locations in the arena in an attempt to branch into the opponent warrior's code (a form of mimicry). One complex strategy which evolved was a species which would bomb with imps\footnote{An imp consists of the single instruction ``MOV $0 $1''. This instruction, if executed by a process, will copy itself to the next location in memory. Then on the next clock cycle, the process will execute that copy and move itself ahead again, and so on.}, and then kill the imps when they reached its code. The fact that a cooperative process like this could evolve in so few generations is fascinating. Unfortunately, they were also extinct within a few generations, either because of the strong evolutionary force of the high crossover rate, or because their strategy, although elegant, wasn't effective.

Two second-generation warriors from a pilot version of this project were entered in the 1989 International Core Wars Tournament, and they came in second to last and third to last (the last place finisher had a bug in his program). Two of the best warriors (one sixth generation and one tenth generation) from an early run of phase~1 were entered in the 1990 ICWT and did considerably better, finishing 12th and 16th out of 26 finalists. The fact that these warriors were able to beat human-coded opponents is impressive; the additional fact that they did so in only a few generations is suggestive.
A properly controlled experiment which ran for hundreds, or even thousands, of generations should stand a good chance of winning the tournament. Granted, this is not nearly on a par with beating Karpov at chess, but still it would be an impressive notch on the genetic algorithm's belt.

The first runs of phase~2 experiments are currently being generated and analyzed. Since they weren't fitness tested against human-coded warriors as in phase~1, the fitness scores aren't as valid an indicator of their progress. As an initial analysis technique, we've taken some of the champion warriors from each generation and run them against some of the same warriors that were used in fitness testing in phase~1. Unfortunately, we didn't get this data ready in time for presentation in this initial report, but we can comment on observed results. The progress isn't nearly as steady as in the phase~1 experiments. Two factors could account for this. First, all phase~2 warriors which get any fitness score other than 0 have a chance to reproduce, unlike phase~1 where only the top 10% of the warriors were allowed to reproduce. Second, the data from phase~1 accounts for all of the warriors in each generation, whereas the scores from phase~2 only account for the few warriors which happened to get the highest fitness scores. Thus, in phase~1, there were more chances for mediocre warriors to win or tie through advantageous starting locations. It is important to note that despite these handicaps, some survival and predation strategies still evolved. We feel that this supports the validity of our conclusion that binological predation is an evolvable phenomenon. We hope that warriors of future generations of phase~2 experiments will reach (and perhaps surpass) the complexity level of phase~1 warriors.

Two pieces of data which we are collecting in the phase~2 experiments, but which we ignored in phase~1, are the fitness score for each warrior and the average battle length for each warrior. We feel that this data may be useful in analyzing the dynamics of evolutionary change over the whole population of warriors. For example, GAs have a tendency to find stable points, where much of the population becomes fairly homogeneous. At this point, we would expect a lot of ties, so the fitness scores should be lower and more evenly spread out. When one warrior gains some new ability, through either mutation or crossover, we would expect its score to be much higher for a few generations until it could spread this feature through reproduction. Next, we would expect this new feature to slowly ripple throughout the space until it was fully distributed. The interesting point to look for is the initial spike in the fitness landscape, as this indicates where the productive mutation came about.
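As a purely hypothetical illustration of this kind of analysis (the function names, data layout, and the three-standard-deviation threshold below are our assumptions, not part of the experiments), the per-generation fitness grids could be summarized and scanned for such spikes like this:
\begin{verbatim}
import statistics

def fitness_summary(grids):
    """Summarize per-generation fitness for a phase-2 style run.

    `grids` is a list indexed by generation; each entry is a 30x30
    nested list of fitness scores (an assumed layout, for illustration).
    Returns (mean, spread, best) per generation."""
    summary = []
    for grid in grids:
        scores = [s for row in grid for s in row]
        summary.append((statistics.mean(scores),
                        statistics.pstdev(scores),
                        max(scores)))
    return summary

def spike_generations(summary, factor=3.0):
    """Flag generations whose best score stands well above the mean --
    a crude indicator that a productive mutation or crossover has
    appeared but not yet spread.  The factor of 3 standard deviations
    is an arbitrary illustrative threshold."""
    spikes = []
    for gen, (mean, spread, best) in enumerate(summary):
        if spread > 0 and best > mean + factor * spread:
            spikes.append(gen)
    return spikes
\end{verbatim}
A generation flagged this way would be a natural place to start watching the new feature ripple outward through the 30 by 30 grid in subsequent generations.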
Conclusions/Discussion

There is a wide range of possibilities for further experiments on this subject. Currently, we are working on the suggested fitness landscape maps, and also planning several other experiments. The obvious ones that come to mind are variations in the basic parameters of the system, such as mutation and crossover rates, population size, number of generations, neighborhood structure, and the method of generating the initial random warriors. A more ambitious project would be to graph family trees of warriors, thus tracing their genetic progress. A slightly further deviation from this project would be to evolve programs for other tasks and in other instruction sets; in fact this has been done (more or less), but not on tasks that are distinctly binological in nature.
The Bibliography

Buckley, William R. The Rising Tide of Computational Vulnerability. IEEE Society on the Social Implications of Technology, Conference on Technics, Culture and Consequences, California State University, Los Angeles, October 20--21, 1989.

Buckley, William R. A Core War Model of Computational Evolution. International Society for the Systems Sciences, Proceedings of the 34th Annual Meeting, Portland, Oregon, July 8--13, 1990.

Collins, Robert J. Tracker. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990.

Dawkins, Richard. The Selfish Gene. Oxford University Press, Oxford, 1976.

Depew, David J. & Weber, Bruce H. (eds.). Evolution at a Crossroads. The MIT Press, Cambridge, 1985.

Dewdney, A. K. In the Game Called Core War Hostile Programs Engage in a Battle of Bits. Scientific American, May 1984.

Goldberg, David. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, New York, 1989.

Holland, John H. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, 1975.

Langton, Chris. Studying Artificial Life with Cellular Automata. Physica, pp. 120--149, 1987.

Perry, John R. Predatory Programming. The Core Wars Newsletter, Fall 1988.

Rasmussen, Steen. The Coreworld: Emergence and Evolution of Cooperative Structures in a Computational Chemistry. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990.

Stork, David G. Non-Optimality via Preadaptation in Simple Neural Systems. Forthcoming in the Proceedings of the Second Workshop on Artificial Life, Santa Fe, 1990.

The Core War Standard of 1988. International Core Wars Society, 1988.

Definitions
[Binology] The study of computer-based complex systems which demonstrate lifelike characteristics.

[Computer Virus] A computer virus is a program which effects its distribution by attaching itself to the executable code of another program, called a vector program.