Steven C. Morrell : Mathematical Models for Step Sizes


Steven C. Morrell : Mathematical Models for Step Sizes

 morrell@math.utah.edu
 10 Apr 1996 05:29:54 -0400
 Storm King Ind. Inc.

Abstract: Corewar is  a programming game  played on a  board simulating computer
memory.  This board is called the core, and consists of N cells, each containing
an assembly language command.   Programs run simultaneously, with  the objective
of killing other programs in core and thus becoming the sole owner of the  core.
Addition, subtraction, and multiplication give  the core a Z/NZ ring  structure.
N is called the coresize.  Popular values of Nare 8000, 8192, and 55440.

In this article,  we examine ways  to optimize simple  bombing or scanning  step
sizes.   We  give formulas  to  measure 'optimization'  of  a step  for  a given
coresize, and prove that imp-pairs have the same optima score.

Contents:

	(0) Terminology
	(1) Euclid's Algorithm and Continued Fractions
	(2) Some Models for Bombing Core (incomplete)
	(3) A Better Formula for the Length of Intervals
	(4) The Optima Step Formula
	(5) The Find-X Formula (omitted in current draft)
	(6) Locally Fibonacci Step-Sizes (omitted in current draft)
	(7) Calculations (omitted in current draft)


(0) Terminology

We denote subscripts and  superscripts in a TeX-ish  way.  So "X sub  1" will be
written X_1, and X^2 = X*X.  If  we need to perform math on subscripts,  we will
use curly braces to denote groupings.

We also use the following symbols:

  Z+ : The set of all positive integers.
  (A,B) : The greatest common factor of positive integers A,B.
  [x] : The integral part of a non-negative real number x.
  A|B : The integer A divides the integer B evenly (that is, without remainder.)


(1) Euclid's Algorithm and Continued Fractions

In this section,  we build the  mathematical framework for  the results of  this
article.  We explore the  relationship betwwen Euclid's Algorithm  and continued
fractions, and prove a  classic result of Euler,  who seems to have  anticipated
Corewar in 1764.

Recall that Euclid's Algorithm computes (A,B), with integers A>B>0, as follows:

  Let X_0 = A and X_1 = B.

  Inductively define
                       X_{i+1} = X_{i-1} - X_i*Y_i,           (1)
  with
                       Y_i=[X_{i-1}/X_i].                     (2)

  Clearly, 0 <= X_{i+1} < X_i.   If X_{i+1}=0, our induction must stop  (because
  Y_{i+1} is undefined).  In this case, define  k to be the largest i for  which
  X_i is non-zero.   Since X_{i+1}0.  Hence, X_k  and
  (A,B) have the same factors, so X_k=(A,B).


Definition:  Given A_1,A_2,...,A_n in Z+,   we  define  the  continued  fraction 
generated by the A_i's as

        = A_1+1/(A_2+1/...(A_{n-1}ð_n)...) (3)

  We define  the length  of   to be  n.  Clearly,   is
  always rational.  We define the numerator of  to be p, where  p/q
  is   the   reduced  fraction   with   p/q=  and   denote   it  by
  p=[A_1,...,A_n].  We allow "empty" numerators, defining []=1.


Propositon 1: Given A,B in Z+ with A>B, we have

                     A/B = 

  where the Y_i's and k are defined by Euclid's algorithm using (1) and (2).

Proof: We the  notations of Euclid's  algorithm above and  induction on k.  When
k=1, we have X_2=0 (by definition of k), so we know by (1) with i=1 that

                    0 = X_0-Y_1*X_1.
Hence,
                     = Y_1 = X_0/X_1.

Now, suppose k>1.  Note that X_1>X_2>0,  so we can use Euclid's algorthm  on A'=
X_1, B'=  X_2 to  find each  Y'_i, and  it is  easy to  see that Y'_i = Y_{i+1}.
Hence, for induction we may assume that

                    X_1 / X_2 = .

We then have
                    = Y_1+1/
                                     = Y_1+1/(X_1/X_2)
                                     = (Y_1 * X_1 + X_2)/X_1
                                     = X_0/X_1
                                     = A/B.
[box]


Corollary 2: If we are given the Y_i's and m=(A,B), we can recover A and B.

Proof:   is  rational,  hence can be uniquely written as a reduced
fraction a/b.

  Existence:  Since this fraction is reduced, we have (a,b)=1.
  Define A= ma, B= mb.  Then A/B=a/b= and (A,B)=m*(a,b)=1.

  Uniqueness: If A/B=A'/B'= and (A,B)=(A',B')=m then A/m,B/m are in
  Z+ and (A/m,B/m)=1, so (A/m)/(B/m)  is the reduced fraction a/b.  Hence A/m=a,
  B/m=b.  Similary, A'/m=a, B'/m=b.  Thus A=A' and B=B'.
  
[box]


Corollary 3: Given A,B in Z+ with A>B and (A,B)=1, we have

                     A = [Y_1,...,Y_k],
                     B = [Y_2,...,Y_k]

  where the Y_i's and k are defined by Euclid's algorithm using (1) and (2).

Proof: From Proposition 1, we know A/B=.  This fraction is
  reduced when (A,B)=1, so A = [Y_1,...,Y_k] by definition.  If k=1, then
  B=1=[].
  
  Now if k>1 and C/D is the reduced fraction , then we have

                 A/B= Y_1 + 1/(C/D)
                    = (Y_1*C + D)/C.

  The fractions on both sides of this equation are reduced, hence B=C.  But
  C = [Y_2,...,Y_k] by definition.
[box]


Corollary 4: For any continued fraction  with n>=2, the
  following identity holds:

         [A_1,...,A_n] = A_1*[A_2,...,A_n] + [A_3,...,A_n].    (4)
             
Proof: If A/B= and C/D= are both reduced
  fractions, then A/B= (A_1*C + D)/C, as in the proof of Corollary 3.
  Multiplying both sides by B=C, we obtain:
  
                        A = A_1*C + D.
  
  From Corollary 3, we have A = [A_1,...,A_n], C = [A_2,...,A_n], and
  D = [A_3,...,A_n].
[box]


Examples: 43/10 = <4,3,3>.  Also, 43/10 = <4,3,2,1>, even though these
  are not the Y_i defined in (2).

  43/30 = <1,2,3,4>.

  If (A,B)=1 and A/B = <1,1,...,1> (k times), then A=F_{k+1} and B=F_k,
  where F_i is the i'th term in the Fibonacci sequence {1,1,2,3,5,8,13,...}.


We now come to the main result of this section, proved by Euler in 1764.

Theorem 5: For any continued fraction , the following
  relations hold:

           [A_1,...,A_n]=[A_n,...,A_1]                         (5)

          [A_2,...,A_n] * [A_1,...,A_{n-1}] =
             [A_1,...,A_n] * [A_2,...,A_{n-1}] - (-1)^n  (n>1) (6)

Proof: We use induction on n.  When n=0 or n=1, (5) is tautological.
  And for n=2, we have, by (4),

                      [A_1,A_2]= A_1 * [A_2]+ []
                               = A_1 * A_2 + 1.

  Hence, [A_2] * [A_1] = A_2 * A_1
                       = (A_1 * A_2 + 1)*1 -1
                       = [A_1,A_2]*[]-(-1)^2,
  
  so (6) holds.

  For n>=2, we use (4) and induction on (5) repeatedly to
  get:

     [A_1,...,A_2] = A_1 * [A_2,...,A_n] + [A_3,...,A_n]
                    = A_1 * [A_n,...,A_2] + [A_n,...,A_3]
                    = A_1 * (A_n * [A_{n-1},...,A_2] + [A_{n-2},...,A_2])
                          +  A_n * [A_{n-1},...,A_3] + [A_{n-2},...,A_3]
                    = A_1 * A_n * [A_2,...,A_{n-1}]
                          + A_1 * [A_2,...,A_{n-2}]
                          + A_n * [A_3,...,A_{n-1}]
                          +       [A_3,...,A_{n-2}].

  If we replace (A_1,...,A_n) with (A_n,...,A_1), the first and fourth
  terms of the right-hand expression stay the same, and the second and
  third terms get swapped.  Thus,

                   [A_1,...,A_n] = [A_n,...,A_1]

  since they both equal the same expression.  This proves (5) for all n>0.

  Finally, for n>=3, we can assume by induction that (6) holds for
  .  In other words,

   [A_2,...,A_{n-1}]*[A_3,...,A_n}] = [A_3,...,A_{n-1}]*[A_2,...,A_n]
                                      -(-1)^{n-1}.

  Thus,

   [A_1,...,A_n]*[A_2,...,A_{n-1}]
     = (A_1*[A_2,...,A_n]+[A_3,...,A_n]) *[A_2,...,A_{n-1}]
     = A_1*[A_2,...,A_n]*[A_2,...,A_{n-1}] +[A_3,...,A_n]*[A_2,...,A_{n-1}]
     = A_1*[A_2,...,A_n]*[A_2,...,A_{n-1}]
                             +[A_3,...,A_{n-1}]*[A_2,...,A_n]+(-1)^n
     = [A_2,...,A_n]*(A_1*[A_2,...,A_{n-1}]+[A_3,...,A_{n-1}])+(-1)^n
     = [A_2,...,A_n]*[A_1,...,A_{n-1}]+(-1)^n.

  This is (6) for , so by induction, (6) holds for all n>1.
[box]


Corollary 6: Given positive integers A,B, with A/B = ,
  define positive integers A',B' by

               A'/B' =                   (7)
  where
              (A',B')=(A,B)                                   (8)

  Then the following relationships hold:

               A' = A;                                        (9)

               B'*B = -(-1)^k*(A,B)^2 (mod A).                (10)

Proof:  Let m=(A,B).  Then (A/m)/(B/m) is the reduced form of A/B, whence

               A = m*[Y_1,...,Y_k],
               B = m*[Y_2,...,Y_k].

  Similarly,
  
               A' = m*[Y_k,...,Y_1],
               B' = m*[Y_{k-1},...,Y_1].
  
  Then (9) follows from (5), and (6) can be written as:
  
               (B/m)*(B'/m)=(A/m)*[Y_2,...,Y_{k-1}]-(-1)^k.
  
  We multiply both sides by m^2 to get
  
               B*B'=A*m*[Y_2,...,Y_{k-1}]-(-1)^k*m^2.
  
  Then (10) follows from passing to congruence (mod A).
[box]


(2) Some Models for Bombing Core

Let N be the coresize  and let KZ/NZ given by  f(n)=n+N. Rather
than mess with cosets, we will  refer to elements of core by  representatives in
Z.  For a0, let U_j be the interval containing 0 in the partition after
j bombs.


Theorem 7: For all j with 0=0 is in the partition.  In the first case, we have

       ||[a,b)||=||[d+g*K,e+G*K)||=e-d=||[d,e)||,

  so we  are done.   Otherwise, since  d is  in the  bombing run,  we know  that
  d=l*K for some l with 1(3) A Better Formula for the Length of Intervals

Here is some terminology for the next result.

Let X_0=N, X_1=K, and define X_i and Y_i using Euclid's algorithm (that is,
equations (1) and (2)).  Further, define Z_i inductively by Z_0=0, Z_1=1,
and
           Z_{i+1}=Z_i*Y_i+Z_{i-1}.                           (11)

By proposition 1, we have X_0/X_1=(X_0/m)/(X_1/m)=.  Applying
Corollary 3 to X_0/m,X_1/m, we get 

           X_0= m*[Y_1,...,Y_k]
           X_1= m*[Y_2,...,Y_k]

and inductively using Corollary 4 with equation (1), we get

           X_i= m*[Y_{i+1},...,Y_k]      (0<=i<=k).           (12)

Inductively using Corollary 4 with equation (11), we also get

           Z_i= [Y_{i-1},...,Y_1]        (1<=i<=k+1).         (13)

Now we can describe the bombing sequence as a series of bombing runs by
looking at the intervals U_i containing 0.

Definition:
  The i'th  bombing run  for i=1 and  X_{i-1}-p*X_i>= 0,  whch
  means p<=Y_i.

  We have by hypothesis that the first bomb in the i'th bombing run is   dropped
  at X_{i-1}-X_i and shrinks U_j.  So, what  is the gap in time between a  bombs
  being  dropped at  X_{i-1}-(p-1)*X_i and  X_{i-1}-p*X_i? Since  all bombs  are
  evenly spaced, this gap will be the same for all p, and with p=1, we know that
  the Z_{i-1}'th bomb  dropped at X_{i-1}  and the Z_{i-1}+Z_i'th  bomb drops at
  X_{i-1}-X_i.  Thus the value of p increases every Z_i bombs.  We know that the
  value of p starts at one with  the Z_i+Z_{i-1}'th bomb, so we can verify  that
  the p given in the statement of the theorem is correct.

  Next, we  must verify  that the  bombs we  just dropped  were the i'th bombing
  run.  We  started with  bomb Z_i+Z_{i-1},  and dropped  Z_i bombs  for each  p
  between  1 and  Y_i (except  for the  last bombing  run, where  p only  ranged
  between 1 and Y_i-1).   Thus, the last bomb  in this bombing was  the Z_i+Z_{i
  -1}+Y_i*Z_i-1=Z_i+Z_{i+1}-1'th bomb (or Z_{k+1}-1=N/m-1 for the last run).

  From the  last two  paragraphs, with  p=Y_i+1, we  see that  the first bomb of
  the next bombing run is  dropped at X_{i-1}-(p+1)*X_i= X_{i+1}-X_i.  And  with
  p=Y_i, we have the Z_{i+1}'th bomb is the Z_i*Y_i+Z_{i-1}'th bomb, which drops
  at X_{i-1}-X_i*Y_i= X_{i+1}.  This finishes the induction
  step.
[box]


(4) The Optima Step Formula

In Corewar,  the step  size of  a bombing  or scanning  pattern is  the distance
between successive bombing or scanning  locations.  Chosing a good step  size is
critical  to the  success of  a warrior,  so we  need a  way of  measuring the
effectiveness of a step size.

What makes one  step size better  than another?  Here  is one train  of thought:
Larger enemies can usually be found faster than small ones.  If your step  takes
the same  time to  find small  enemies as  large ones,  it probably  needs to be
optimized against larger enemies.  On the  other hand, if your step size  is too
coarse, small programs can slip through the cracks.  For instance, assuming 8000
instructions in  core, a  step size  of 4000  is bad.   The first  two bombs are
spaced well, but  the pattern only  hits two core  locations.  An ideal  pattern
will divide the core into successively smaller chunks, until it has checked  all
locations in core (except where your code resides).

We can measure the success of this  method by adding the lengths of the  largest
untouched segments during different stages  of bombing.  This sum is  the optima
function of a coresize and step.  A lower function value between step-sizes with
the same greatest common factor indicates a more optimal pattern.


Definition: The optima function is defined to be:

            O(N,K)= sum_j( ||U_j|| ).

Corollary 8 assures that this is the sum we want.


Proposition 10: O(N,K) = O(N,N-K).

Proof:
  In coresize N, we can consider a step-size of N-K as a step-size of K  bombing
  in the other direction.   If the U_j are  the intervals containing 0  for step
  -size K,  as above,  then, by  symmetry, the  intervals U'_j  containing 0 for
  step-size N-K will  just be the  U_j reversed; that  is, if for  some j, U_j=[
  -a,b), then U'_j=[-b,a).  Note that

              ||U_j|| = a+b = ||U'_j||.

  We sum over all j to get the result.
[box]


Theorem 11: 
    O(N,K)= sum_{i=1}^k(X_{i-1}*Z{i+1}-X_i*Z_i-X_i*Z_i*(Y_i*(Y_i-1)/2)

Proof: We write O(N,K)= sum_{i=1}^k O_i(N,K), where each O_i is the sum of
  the ||U_j|| from the i'th bombing run.  In the i'th bombing run, we have
  (by theorem 9) Z_i intervals of length X_{i-1}-(p-1)*X_i, for p ranging
  from 1 to Y_i.  Adding these up, we think that
  
           O_i(N,K)= sum_{p=1}^{Y_i} Z_i*(X_{i-1}-(p-1)*X_i)
                   = sum_{p=1}^{Y_i}(Z_i*X_{i-1})
                       - sum_{p=1}^{Y_i}(Z_i*X_i*(p-1))
                   = Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2)
  
  Well, not quite.  We forgot that the last bombing run only uses the
  values of p from 1 to Y_k-1.  Thus we must add the correction term
  
           -Z_k*(X_{k-1}-(Y_k-1)*X_k)= -Z_k*(X_{k-1}-Y_k*X_k+X_k)
                                     = -Z_k*(X_{k+1}+X_k)
                                     = -Z_k*X_k

  since X_{k+1}=0.  We thus have

      O(N,K)=                                                 (15)
         sum_{i=1}^k( Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) ) -Z_k*X_k.


  We also have the following telescoping sum:

         sum_{i=1}^k( X_i*Z_i - X_{i-1}*Z_{i-1})
                      = X_k*Z_k - X_0*Z_0
                      = X_k*Z_k

  since Z_0=0.  Substituting this into (15), we get
  
      O(N,K)=                                                 (16)
         sum_{i=1}^k( Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) - X_i*Z_i
                      + X_{i-1}*Z_{i-1}).

  and the first and fourth terms of the sum combine thusly:
  
     Y_i*Z_i*X_{i-1} + X_{i-1}*Z_{i-1} = X_{i-1}*(Y_i*Z_i+Z_{i-1})
                                       = X_{i-1}*Z{i+1}.

  to yield the formula we wished to prove.
[box]


Corollary 12: If {K,K''} is an imp-pair for coresize N, then

                        O(N,K)=O(N,K'').

Proof:
  Recall that {K,K''} is an imp-pair if and  only if K*K'' = 1 (mod N). For  any
  given K,N, we know that  K'' exists if and only  if (K,N)=1, and we know  that
  K'' is unique.

  Construct K' and N' as in corollary 6.  Then we have

                     N=[Y_1,...,Y_k],
                     K=[Y_2,...,Y_k],
                     N'=[Y_k,...,Y_1],
                     K'=[Y_{k-1},...,Y_1].

  If we define X'_i,Y'_i, and Z'_i by X'_0=N',X'_1=K' and the recurrence
  relations in section 3, we get the follwing for all applicable i:
  
                     Y'_i=Y_{k+1-i}
                     X'_i=Z_{k+1-i}
                     Z'_i=X_{k+1-i}.

  If we apply theorem 11 and reverse the order of summation, we get
  
    O(N',K')= sum_{i=1}^k(X'_{i-1}*Z'{i+1}
                             -X'_i*Z'_i
                             -X'_i*Z'_i*(Y'_i*(Y'_i-1)/2)
            = sum_{i=1}^k(X'_{k+1-(i-1)}*Z'_{k+1-(i+1)}
                       -X'_{k+1-i}*Z'_{k+1-i}
                       -X'_{k+1-i}*Z'_{k+1-i}*(Y'_{k+1-i}*(Y'_{k+1-i}-1)/2)
            = sum_{i=1}^k(Z_{i+1}*X_{i-1}
                             -Z_i*X_i
                             -Z_i*X_i*(Y_i*(Y_i-1)/2)
            = O(N,K).
  
  Now, by corollary 6, we have N'=N and K*K'= +/- 1 (mod N), so by uniqueness of
  K'', we  have either  K''=K' (mod  N) or  K''=-K' (mod  N). In the first case,
  O(N,K'')=O(N',K'), and  in the  second case,  O(N,K'')= O(N,N-K')=O(N',K')  by
  proposition 10.
  
[box]