Friday, December 17, 2021

Farmer (IOI 2004) solution and proof

Easy- and medium-difficulty competitive programming problems generally involve a single core technique, for example one graph algorithm, or one computational geometry tool, or one greedy method. Harder problems generally involve more than one technique.

Farmer is probably the easiest problem on the IOI 2004 paper. The task goes: fences consist of fenceposts and the segments in between the posts. Some fences are closed, and these have equal numbers of fenceposts and segments. Other fences are open , and these consist of one less segment than the number of fenceposts. You are given $M$ closed fences (loops) with $m_1, \dots, m_M$ fenceposts each and $K$ open fences (strips) of $k_1, \dots, k_K$ fenceposts each. You must choose any $Q$ fenceposts from among these fences. Whenever you pick two consecutive fenceposts, you earn the the segment in between. What is the maximum number of segments you can earn?

When presented with a non-trivial problem, it's a good idea to try make some observations.

  • We obviously want to always choose sequences of consecutive posts where possible.
  • Loops are "high-value", in the sense that if we pick all the posts in a loop with $n$ posts we earn $n$ segments. But if we pick a strip of $n$ posts, or a portion of a loop with $n$ posts, we earn just $n-1$ segments.
  • Loops are equal value regardless of size. Two loops of size 3 each has the same value as one loop of size 6.
  • When presented with a long strip and a short strip, we can always take a portion of the long strip of equal size to the short strip, earning the same number of segments. So there is never a situation where picking a short strip is advantageous to picking a portion of a long strip.
  • We earn more segments by taking $n$ posts from a long strip than by taking $n$ posts from two or more short strips. In combination with the previous observation, this means we should always take posts from long strips over short strips.

At this point we can start forming an algorithm. The best case scenario would clearly be to take only full loops, in which case we earn exactly $Q$ segments. To determine if this is possible, we must determine if some subset $m_1, \dots, m_M$ sums to $Q$. This is the subset sum problem which can be solved by dynamic programming. So this is what we check first, and if it's possible than we're done. If not, then we cannot achieve $Q$, since any other selection of posts must select at least one strip or at least one portion of a loop.

If we can't achieve $Q$, then there are two cases to consider:

  1. Suppose $\sum_i m_i > Q$. Loops are higher-value than strips, and all loops are equal value, so suppose we just keep taking full loops in any order while we haven't depleted the quota. For each of these loops, we get a number of segments equal to the number of posts. Eventually we will have taken a total of say $u$ posts, earning $u$ segments, and we will reach a loop which has more posts than our remaining quota $Q-u$. If we take a strip of $q$ posts from this loop, we earn $Q-u-1$ segments, for a total of $Q-1$ segments earned. We know that $Q$ segments are unattainable, so if we earn $Q-1$ we have achieved the best possible result. Hence this greedy algorithm solves this case.
  2. Suppose instead $\sum_i m_i < Q$. Start by taking all the loops, because they are highest-value. Now only strips remain. By our earlier observations on strips of differing lengths, we can just greedily consume strips in descending order of length while quota remains.

Even in this relatively simple problem, there are two intertwined components that require different techniques: a dynamic programming component followed by a greedy component. One of the most important skills demanded of a competitive programmer is to be able to disentangle a seemingly intractable problem into its components in this way.

Monday, December 13, 2021

2048 in Python

Let's write 2048 in Python. Disclaimer: this is 2am bored-on-the-couch code and is not representative of my professional coding style ;)

First let's implement the action of the left arrowkey. This isn't the shortest or most elegant code, but it is easy to reason about. We treat each row individually, ignore the blank elements in it, and compress the remaining elements into the left. For the element-combining logic, we first decide if anything needs to be combined, and if anything does, we consider cases. There are actually very few cases that provoke combination, and these can be enumerated as: AA, AAB, ABB, AABB, AABC, ABBC, ABCC. We can handle AAAA, ABAA, etc. through these cases too.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
def left(x):
    news = []
    score = 0
    for row in x:
        c = [v for v in row if v]
        if all(a != b for a, b in zip(c, c[1:])):
            new = c[:]
        elif len(c) == 2:
            new = [c[0]*2]; score = c[0]*2
        elif len(c) == 3:
            if c[0] == c[1]:
                new = [c[0]*2, c[2]]; score = c[0]*2
            elif c[1] == c[2]:
                new = [c[0], c[1]*2]; score = c[1]*2
        else:
            if c[0] == c[1] and c[2] == c[3]:
                new = [c[0]*2, c[1]*2]; score = c[0]*2 + c[1]*2
            elif c[0] == c[1]:
                new = [c[0]*2, c[2], c[3]]; score = c[0]*2
            elif c[1] == c[2]:
                new = [c[0], c[1]*2, c[3]]; score = c[1]*2
            elif c[2] == c[3]:
                new = [c[0], c[1], c[2]*2]; score = c[2]*2
        new += [None] * (4 - len(new))
        news.append(new)
    return news, score

The remaining arrowkey actions can be implemented by their symmetry with the left arrowkey action, via rotation. The do method will carry out the arrowkey actions. We only place a new tile on the board if the action resulted in a state change.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
def cw(x):
    return [[x[3-j][i] for j in range(4)] for i in range(4)]

def do(x, d):
    if d == 'l':
        a, s = left(x)
    if d == 'd':
        a, s = left(cw(x))
        a = cw(cw(cw(a)))
    if d == 'r':
        a, s = left(cw(cw(x)))
        a = cw(cw(a))
    if d == 'u':
        a, s = left(cw(cw(cw(x))))
        a = cw(a)
    return (placenew(a) if a != x else a), s

The placenew function puts a new 2 or 4 into an empty square, following the probabilities of the standard web version of the game.

1
2
3
4
5
6
def placenew(x):
    es = [(r,c) for r in range(4) for c in range(4) if x[r][c] is None]
    r, c = random.choice(es)
    y = deepcopy(x)
    y[r][c] = 4 if random.random() < 0.1 else 2
    return y

This function determines if our position can be escaped. If it can't be escaped with any actions, the game is over.

1
2
def escapable(x, dirs='udlr'):
    return any(do(x,d)[0] != x for d in dirs)

Let's code up a play strategy where we pick actions at random.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
x = [
    [None, None, 2, None],
    [None, None, None, None],
    [None, None, None, None],
    [None, 2, None, None],
]

def playrandom(x):
    ts = 0
    while escapable(x):
        x, s = do(x, random.choice('udlr'))
        ts += s
    return ts

How successful is this strategy? Let's play lots of random games and see the average score.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
def mean(v):
    return sum(v) / len(v)

def stdev(v):
    u = mean(v)
    return (sum([(k-u)**2 for k in v]) / (len(v)-1))**0.5

scores = []
while True:
    s = playrandom(x)
    scores.append(s)
    if len(scores) > 1:
        print(int(mean(scores)), int(stdev(scores)), max(scores))

This shows that the mean score at endgame with the random strategy is about 810. Can we do better? Anyone who's played the game a couple times will know that ideally you wanna keep your big tiles in one corner, say the bottom left corner. So alternating down and left actions might prove more effective. We can add in right actions if we get stuck with downs and lefts. In a true disaster we can try go up and back down.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
def playwell(x):
    ts = 0
    while escapable(x):
        start = ts
        x, s = do(x, 'd'); ts += s
        x, s = do(x, 'l'); ts += s
        if not escapable(x, 'dl'):
            x, s = do(x, 'r'); ts += s
            if not escapable(x, 'dlr'):
                x, s = do(x, 'u'); ts += s
                x, s = do(x, 'd'); ts += s
    return ts

This gives a mean endgame score of about 2400. Nice!