Steven C. Morrell : Mathematical Models for Step Sizes
morrell@math.utah.edu
10 Apr 1996 05:29:54 -0400
Storm King Ind. Inc.
Abstract: Corewar is a programming game played on a board simulating computer
memory. This board is called the core, and consists of N cells, each containing
an assembly language command. Programs run simultaneously, with the objective
of killing other programs in core and thus becoming the sole owner of the core.
Addition, subtraction, and multiplication give the core a Z/NZ ring structure.
N is called the coresize. Popular values of Nare 8000, 8192, and 55440.
In this article, we examine ways to optimize simple bombing or scanning step
sizes. We give formulas to measure 'optimization' of a step for a given
coresize, and prove that imp-pairs have the same optima score.
Contents:
(0) Terminology
(1) Euclid's Algorithm and Continued Fractions
(2) Some Models for Bombing Core (incomplete)
(3) A Better Formula for the Length of Intervals
(4) The Optima Step Formula
(5) The Find-X Formula (omitted in current draft)
(6) Locally Fibonacci Step-Sizes (omitted in current draft)
(7) Calculations (omitted in current draft)
(0) Terminology
We denote subscripts and superscripts in a TeX-ish way. So "X sub 1" will be
written X_1, and X^2 = X*X. If we need to perform math on subscripts, we will
use curly braces to denote groupings.
We also use the following symbols:
Z+ : The set of all positive integers.
(A,B) : The greatest common factor of positive integers A,B.
[x] : The integral part of a non-negative real number x.
A|B : The integer A divides the integer B evenly (that is, without remainder.)
(1) Euclid's Algorithm and Continued Fractions
In this section, we build the mathematical framework for the results of this
article. We explore the relationship betwwen Euclid's Algorithm and continued
fractions, and prove a classic result of Euler, who seems to have anticipated
Corewar in 1764.
Recall that Euclid's Algorithm computes (A,B), with integers A>B>0, as follows:
Let X_0 = A and X_1 = B.
Inductively define
X_{i+1} = X_{i-1} - X_i*Y_i, (1)
with
Y_i=[X_{i-1}/X_i]. (2)
Clearly, 0 <= X_{i+1} < X_i. If X_{i+1}=0, our induction must stop (because
Y_{i+1} is undefined). In this case, define k to be the largest i for which
X_i is non-zero. Since X_{i+1}0. Hence, X_k and
(A,B) have the same factors, so X_k=(A,B).
Definition: Given A_1,A_2,...,A_n in Z+, we define the continued fraction
generated by the A_i's as
= A_1+1/(A_2+1/...(A_{n-1}ð_n)...) (3)
We define the length of to be n. Clearly, is
always rational. We define the numerator of to be p, where p/q
is the reduced fraction with p/q= and denote it by
p=[A_1,...,A_n]. We allow "empty" numerators, defining []=1.
Propositon 1: Given A,B in Z+ with A>B, we have
A/B =
where the Y_i's and k are defined by Euclid's algorithm using (1) and (2).
Proof: We the notations of Euclid's algorithm above and induction on k. When
k=1, we have X_2=0 (by definition of k), so we know by (1) with i=1 that
0 = X_0-Y_1*X_1.
Hence,
= Y_1 = X_0/X_1.
Now, suppose k>1. Note that X_1>X_2>0, so we can use Euclid's algorthm on A'=
X_1, B'= X_2 to find each Y'_i, and it is easy to see that Y'_i = Y_{i+1}.
Hence, for induction we may assume that
X_1 / X_2 = .
We then have
= Y_1+1/
= Y_1+1/(X_1/X_2)
= (Y_1 * X_1 + X_2)/X_1
= X_0/X_1
= A/B.
[box]
Corollary 2: If we are given the Y_i's and m=(A,B), we can recover A and B.
Proof: is rational, hence can be uniquely written as a reduced
fraction a/b.
Existence: Since this fraction is reduced, we have (a,b)=1.
Define A= ma, B= mb. Then A/B=a/b= and (A,B)=m*(a,b)=1.
Uniqueness: If A/B=A'/B'= and (A,B)=(A',B')=m then A/m,B/m are in
Z+ and (A/m,B/m)=1, so (A/m)/(B/m) is the reduced fraction a/b. Hence A/m=a,
B/m=b. Similary, A'/m=a, B'/m=b. Thus A=A' and B=B'.
[box]
Corollary 3: Given A,B in Z+ with A>B and (A,B)=1, we have
A = [Y_1,...,Y_k],
B = [Y_2,...,Y_k]
where the Y_i's and k are defined by Euclid's algorithm using (1) and (2).
Proof: From Proposition 1, we know A/B=. This fraction is
reduced when (A,B)=1, so A = [Y_1,...,Y_k] by definition. If k=1, then
B=1=[].
Now if k>1 and C/D is the reduced fraction , then we have
A/B= Y_1 + 1/(C/D)
= (Y_1*C + D)/C.
The fractions on both sides of this equation are reduced, hence B=C. But
C = [Y_2,...,Y_k] by definition.
[box]
Corollary 4: For any continued fraction with n>=2, the
following identity holds:
[A_1,...,A_n] = A_1*[A_2,...,A_n] + [A_3,...,A_n]. (4)
Proof: If A/B= and C/D= are both reduced
fractions, then A/B= (A_1*C + D)/C, as in the proof of Corollary 3.
Multiplying both sides by B=C, we obtain:
A = A_1*C + D.
From Corollary 3, we have A = [A_1,...,A_n], C = [A_2,...,A_n], and
D = [A_3,...,A_n].
[box]
Examples: 43/10 = <4,3,3>. Also, 43/10 = <4,3,2,1>, even though these
are not the Y_i defined in (2).
43/30 = <1,2,3,4>.
If (A,B)=1 and A/B = <1,1,...,1> (k times), then A=F_{k+1} and B=F_k,
where F_i is the i'th term in the Fibonacci sequence {1,1,2,3,5,8,13,...}.
We now come to the main result of this section, proved by Euler in 1764.
Theorem 5: For any continued fraction , the following
relations hold:
[A_1,...,A_n]=[A_n,...,A_1] (5)
[A_2,...,A_n] * [A_1,...,A_{n-1}] =
[A_1,...,A_n] * [A_2,...,A_{n-1}] - (-1)^n (n>1) (6)
Proof: We use induction on n. When n=0 or n=1, (5) is tautological.
And for n=2, we have, by (4),
[A_1,A_2]= A_1 * [A_2]+ []
= A_1 * A_2 + 1.
Hence, [A_2] * [A_1] = A_2 * A_1
= (A_1 * A_2 + 1)*1 -1
= [A_1,A_2]*[]-(-1)^2,
so (6) holds.
For n>=2, we use (4) and induction on (5) repeatedly to
get:
[A_1,...,A_2] = A_1 * [A_2,...,A_n] + [A_3,...,A_n]
= A_1 * [A_n,...,A_2] + [A_n,...,A_3]
= A_1 * (A_n * [A_{n-1},...,A_2] + [A_{n-2},...,A_2])
+ A_n * [A_{n-1},...,A_3] + [A_{n-2},...,A_3]
= A_1 * A_n * [A_2,...,A_{n-1}]
+ A_1 * [A_2,...,A_{n-2}]
+ A_n * [A_3,...,A_{n-1}]
+ [A_3,...,A_{n-2}].
If we replace (A_1,...,A_n) with (A_n,...,A_1), the first and fourth
terms of the right-hand expression stay the same, and the second and
third terms get swapped. Thus,
[A_1,...,A_n] = [A_n,...,A_1]
since they both equal the same expression. This proves (5) for all n>0.
Finally, for n>=3, we can assume by induction that (6) holds for
. In other words,
[A_2,...,A_{n-1}]*[A_3,...,A_n}] = [A_3,...,A_{n-1}]*[A_2,...,A_n]
-(-1)^{n-1}.
Thus,
[A_1,...,A_n]*[A_2,...,A_{n-1}]
= (A_1*[A_2,...,A_n]+[A_3,...,A_n]) *[A_2,...,A_{n-1}]
= A_1*[A_2,...,A_n]*[A_2,...,A_{n-1}] +[A_3,...,A_n]*[A_2,...,A_{n-1}]
= A_1*[A_2,...,A_n]*[A_2,...,A_{n-1}]
+[A_3,...,A_{n-1}]*[A_2,...,A_n]+(-1)^n
= [A_2,...,A_n]*(A_1*[A_2,...,A_{n-1}]+[A_3,...,A_{n-1}])+(-1)^n
= [A_2,...,A_n]*[A_1,...,A_{n-1}]+(-1)^n.
This is (6) for , so by induction, (6) holds for all n>1.
[box]
Corollary 6: Given positive integers A,B, with A/B = ,
define positive integers A',B' by
A'/B' = (7)
where
(A',B')=(A,B) (8)
Then the following relationships hold:
A' = A; (9)
B'*B = -(-1)^k*(A,B)^2 (mod A). (10)
Proof: Let m=(A,B). Then (A/m)/(B/m) is the reduced form of A/B, whence
A = m*[Y_1,...,Y_k],
B = m*[Y_2,...,Y_k].
Similarly,
A' = m*[Y_k,...,Y_1],
B' = m*[Y_{k-1},...,Y_1].
Then (9) follows from (5), and (6) can be written as:
(B/m)*(B'/m)=(A/m)*[Y_2,...,Y_{k-1}]-(-1)^k.
We multiply both sides by m^2 to get
B*B'=A*m*[Y_2,...,Y_{k-1}]-(-1)^k*m^2.
Then (10) follows from passing to congruence (mod A).
[box]
(2) Some Models for Bombing Core
Let N be the coresize and let KZ/NZ given by f(n)=n+N. Rather
than mess with cosets, we will refer to elements of core by representatives in
Z. For a0, let U_j be the interval containing 0 in the partition after
j bombs.
Theorem 7: For all j with 0=0 is in the partition. In the first case, we have
||[a,b)||=||[d+g*K,e+G*K)||=e-d=||[d,e)||,
so we are done. Otherwise, since d is in the bombing run, we know that
d=l*K for some l with 1(3) A Better Formula for the Length of Intervals
Here is some terminology for the next result.
Let X_0=N, X_1=K, and define X_i and Y_i using Euclid's algorithm (that is,
equations (1) and (2)). Further, define Z_i inductively by Z_0=0, Z_1=1,
and
Z_{i+1}=Z_i*Y_i+Z_{i-1}. (11)
By proposition 1, we have X_0/X_1=(X_0/m)/(X_1/m)=. Applying
Corollary 3 to X_0/m,X_1/m, we get
X_0= m*[Y_1,...,Y_k]
X_1= m*[Y_2,...,Y_k]
and inductively using Corollary 4 with equation (1), we get
X_i= m*[Y_{i+1},...,Y_k] (0<=i<=k). (12)
Inductively using Corollary 4 with equation (11), we also get
Z_i= [Y_{i-1},...,Y_1] (1<=i<=k+1). (13)
Now we can describe the bombing sequence as a series of bombing runs by
looking at the intervals U_i containing 0.
Definition:
The i'th bombing run for i=1 and X_{i-1}-p*X_i>= 0, whch
means p<=Y_i.
We have by hypothesis that the first bomb in the i'th bombing run is dropped
at X_{i-1}-X_i and shrinks U_j. So, what is the gap in time between a bombs
being dropped at X_{i-1}-(p-1)*X_i and X_{i-1}-p*X_i? Since all bombs are
evenly spaced, this gap will be the same for all p, and with p=1, we know that
the Z_{i-1}'th bomb dropped at X_{i-1} and the Z_{i-1}+Z_i'th bomb drops at
X_{i-1}-X_i. Thus the value of p increases every Z_i bombs. We know that the
value of p starts at one with the Z_i+Z_{i-1}'th bomb, so we can verify that
the p given in the statement of the theorem is correct.
Next, we must verify that the bombs we just dropped were the i'th bombing
run. We started with bomb Z_i+Z_{i-1}, and dropped Z_i bombs for each p
between 1 and Y_i (except for the last bombing run, where p only ranged
between 1 and Y_i-1). Thus, the last bomb in this bombing was the Z_i+Z_{i
-1}+Y_i*Z_i-1=Z_i+Z_{i+1}-1'th bomb (or Z_{k+1}-1=N/m-1 for the last run).
From the last two paragraphs, with p=Y_i+1, we see that the first bomb of
the next bombing run is dropped at X_{i-1}-(p+1)*X_i= X_{i+1}-X_i. And with
p=Y_i, we have the Z_{i+1}'th bomb is the Z_i*Y_i+Z_{i-1}'th bomb, which drops
at X_{i-1}-X_i*Y_i= X_{i+1}. This finishes the induction
step.
[box]
(4) The Optima Step Formula
In Corewar, the step size of a bombing or scanning pattern is the distance
between successive bombing or scanning locations. Chosing a good step size is
critical to the success of a warrior, so we need a way of measuring the
effectiveness of a step size.
What makes one step size better than another? Here is one train of thought:
Larger enemies can usually be found faster than small ones. If your step takes
the same time to find small enemies as large ones, it probably needs to be
optimized against larger enemies. On the other hand, if your step size is too
coarse, small programs can slip through the cracks. For instance, assuming 8000
instructions in core, a step size of 4000 is bad. The first two bombs are
spaced well, but the pattern only hits two core locations. An ideal pattern
will divide the core into successively smaller chunks, until it has checked all
locations in core (except where your code resides).
We can measure the success of this method by adding the lengths of the largest
untouched segments during different stages of bombing. This sum is the optima
function of a coresize and step. A lower function value between step-sizes with
the same greatest common factor indicates a more optimal pattern.
Definition: The optima function is defined to be:
O(N,K)= sum_j( ||U_j|| ).
Corollary 8 assures that this is the sum we want.
Proposition 10: O(N,K) = O(N,N-K).
Proof:
In coresize N, we can consider a step-size of N-K as a step-size of K bombing
in the other direction. If the U_j are the intervals containing 0 for step
-size K, as above, then, by symmetry, the intervals U'_j containing 0 for
step-size N-K will just be the U_j reversed; that is, if for some j, U_j=[
-a,b), then U'_j=[-b,a). Note that
||U_j|| = a+b = ||U'_j||.
We sum over all j to get the result.
[box]
Theorem 11:
O(N,K)= sum_{i=1}^k(X_{i-1}*Z{i+1}-X_i*Z_i-X_i*Z_i*(Y_i*(Y_i-1)/2)
Proof: We write O(N,K)= sum_{i=1}^k O_i(N,K), where each O_i is the sum of
the ||U_j|| from the i'th bombing run. In the i'th bombing run, we have
(by theorem 9) Z_i intervals of length X_{i-1}-(p-1)*X_i, for p ranging
from 1 to Y_i. Adding these up, we think that
O_i(N,K)= sum_{p=1}^{Y_i} Z_i*(X_{i-1}-(p-1)*X_i)
= sum_{p=1}^{Y_i}(Z_i*X_{i-1})
- sum_{p=1}^{Y_i}(Z_i*X_i*(p-1))
= Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2)
Well, not quite. We forgot that the last bombing run only uses the
values of p from 1 to Y_k-1. Thus we must add the correction term
-Z_k*(X_{k-1}-(Y_k-1)*X_k)= -Z_k*(X_{k-1}-Y_k*X_k+X_k)
= -Z_k*(X_{k+1}+X_k)
= -Z_k*X_k
since X_{k+1}=0. We thus have
O(N,K)= (15)
sum_{i=1}^k( Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) ) -Z_k*X_k.
We also have the following telescoping sum:
sum_{i=1}^k( X_i*Z_i - X_{i-1}*Z_{i-1})
= X_k*Z_k - X_0*Z_0
= X_k*Z_k
since Z_0=0. Substituting this into (15), we get
O(N,K)= (16)
sum_{i=1}^k( Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) - X_i*Z_i
+ X_{i-1}*Z_{i-1}).
and the first and fourth terms of the sum combine thusly:
Y_i*Z_i*X_{i-1} + X_{i-1}*Z_{i-1} = X_{i-1}*(Y_i*Z_i+Z_{i-1})
= X_{i-1}*Z{i+1}.
to yield the formula we wished to prove.
[box]
Corollary 12: If {K,K''} is an imp-pair for coresize N, then
O(N,K)=O(N,K'').
Proof:
Recall that {K,K''} is an imp-pair if and only if K*K'' = 1 (mod N). For any
given K,N, we know that K'' exists if and only if (K,N)=1, and we know that
K'' is unique.
Construct K' and N' as in corollary 6. Then we have
N=[Y_1,...,Y_k],
K=[Y_2,...,Y_k],
N'=[Y_k,...,Y_1],
K'=[Y_{k-1},...,Y_1].
If we define X'_i,Y'_i, and Z'_i by X'_0=N',X'_1=K' and the recurrence
relations in section 3, we get the follwing for all applicable i:
Y'_i=Y_{k+1-i}
X'_i=Z_{k+1-i}
Z'_i=X_{k+1-i}.
If we apply theorem 11 and reverse the order of summation, we get
O(N',K')= sum_{i=1}^k(X'_{i-1}*Z'{i+1}
-X'_i*Z'_i
-X'_i*Z'_i*(Y'_i*(Y'_i-1)/2)
= sum_{i=1}^k(X'_{k+1-(i-1)}*Z'_{k+1-(i+1)}
-X'_{k+1-i}*Z'_{k+1-i}
-X'_{k+1-i}*Z'_{k+1-i}*(Y'_{k+1-i}*(Y'_{k+1-i}-1)/2)
= sum_{i=1}^k(Z_{i+1}*X_{i-1}
-Z_i*X_i
-Z_i*X_i*(Y_i*(Y_i-1)/2)
= O(N,K).
Now, by corollary 6, we have N'=N and K*K'= +/- 1 (mod N), so by uniqueness of
K'', we have either K''=K' (mod N) or K''=-K' (mod N). In the first case,
O(N,K'')=O(N',K'), and in the second case, O(N,K'')= O(N,N-K')=O(N',K') by
proposition 10.
[box]