Tuesday, May 28, 2013

More like "hacks-imum flow"! Hahaha!

So I have a hunch that the Python maximum flow implementation on Wikipedia is incorrect. Its find_path method is as follows:

    def find_path(self, source, sink, path):
        if source == sink:
            return path
        for edge in self.get_edges(source):
            residual = edge.capacity - self.flow[edge]
            if residual > 0 and not (edge,residual) in path:
                result = self.find_path( edge.sink, sink, path + [(edge,residual)] ) 
                if result != None:
                    return result
The following condition prevents the depth-first search from getting itself stuck in cycles, as it searches over the residual network:

            if residual > 0 and not (edge,residual) in path:
That looks fine. However, consider the case where your residual network is a dense directed acyclic graph in which every edge has positive residual capacity. Furthermore, the source node is the 'root' of this DAG (it's an ancestor of every other node), and there is exactly one path from the source to the sink. The number of simple paths leaving the source grows exponentially with the number of nodes!

The following graphic illustrates such a case. The source is A and the sink is B. All black edges have residual capacity 1, and the red edge has residual capacity 0. As the number of horizontal and vertical 'layers' in the graph increases, the number of cycle-free paths from A to X increases exponentially, and none of these paths lead to the sink. So the algorithm could potentially keep trying these faulty paths before ever stumbling upon a path that leads to B.
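To see the blow-up concretely, here's a quick sketch (my own construction, not the graph from the graphic) that builds a chain of k 'diamonds' and counts how many simple paths a depth-first search over positive-residual edges could end up exploring:

```python
from collections import defaultdict

def count_dfs_paths(k):
    # Build a chain of k 'diamonds': node 3i branches to nodes 3i+1
    # and 3i+2, which both rejoin at node 3(i+1). Assume every edge
    # has positive residual capacity.
    adj = defaultdict(list)
    for i in range(k):
        adj[3 * i] += [3 * i + 1, 3 * i + 2]
        adj[3 * i + 1].append(3 * (i + 1))
        adj[3 * i + 2].append(3 * (i + 1))
    sink = 3 * k

    # Count the simple paths from node 0 to the sink -- a DFS that
    # only avoids repeating edges on its current path may explore
    # every single one of them before giving up.
    def dfs(u):
        if u == sink:
            return 1
        return sum(dfs(v) for v in adj[u])

    return dfs(0)

print(count_dfs_paths(10))  # 1024: two choices per diamond
```

With only 31 nodes there are already 2^10 = 1024 simple paths, so if none of them reached the sink, the DFS would do exponential work.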


Of course, I've almost certainly overlooked something. After all, Wikipedia doesn't have any mistakes, right?
As a plea for your forgiveness, I'll post my max-flow code. It doesn't have any of that silly 'object-oriented' stuff. The network example I use is the one found here.
import copy

N = 7
adj = [[0]*N for _ in range(N)]
adj[0][1] = 3
adj[0][2] = 1
adj[1][3] = 3
adj[2][3] = 5
adj[2][4] = 4
adj[3][6] = 2
adj[4][5] = 2
adj[5][6] = 3
source = 0
sink = 6

# the residual network begins as a copy of the original network
residual = copy.deepcopy(adj)

def get_augpath():
    seen = [False] * N

    def dfs(cur):
        if seen[cur]: return []
        seen[cur] = True

        for next in xrange(N):
            if residual[cur][next] > 0:
                if next == sink:
                    # cur has an edge directly to the sink
                    return [cur, next]
                else:
                    # suffix is a path from cur to sink, or [] if none exists
                    suffix = dfs(next)
                    if suffix:
                        return [cur] + suffix
        # no paths found
        return []
    return dfs(source)

def max_flow():
    mf  = 0

    path = get_augpath()
    # keep iterating while a path exists
    while path:
        # get the minimum residual capacity along our augmenting path
        flow_to_push = min(residual[path[i]][path[i+1]] for i in xrange(len(path)-1))
        mf += flow_to_push
        for i in xrange(len(path) - 1):
            residual[path[i]][path[i+1]] -= flow_to_push
            residual[path[i+1]][path[i]] += flow_to_push
        print 'pushed', flow_to_push, 'along', path

        # try to find another augmenting path
        path = get_augpath()
    return mf

print max_flow()

Monday, May 20, 2013

Turtle Tower

Another nice question I encountered recently!
You want to build a tower of turtles. Each turtle has a weight and a strength. A turtle tower is stable if each turtle in the tower can support all the turtles above it; that is, if each turtle's strength is at least as large as the total weight on top of it. How many turtles are in the tallest stable tower you can make?
Dynamic programming problems jump out at you like crazed carnivorous turtles, and this is no exception. It looks like the longest increasing subsequence problem, with a hint of knapsack. The first step would be to establish an ordering of the turtles such that any stable tower, read from top to bottom, is a subsequence of that ordering; then we'd use some simple dynamic programming to select a maximal such subsequence!

But hang on. What's the ordering? If the turtles are $(s=5, w=5)$ and $(s=10,w=10)$ then the heavier turtle must go below. If the turtles are $(s=2,w=10)$ and $(s=15,w=5)$ then the lighter turtle must go below! If the turtles are $(s=5,w=5)$ and $(s=10,w=2)$ then the stronger turtle must go below; but if the turtles are $(s=10,w=2)$ and $(s=5,w=15)$ then the weaker turtle must go below! Hmm...

It turns out that you have to sort turtles by the sum of their weight and strength. I don't know the intuition behind why this works, but it does; and proving it is an interesting and surprisingly involved exercise.

Denote the strength and weight of turtle $T$ by $s_T$ and $w_T$, respectively.

Lemma: If a stable tower of height $h$ can be built, then a stable tower of height $h$ can be built such that for any pair of turtles $A,B$ where $B$ is below $A$, we have $s_B+w_B \geq s_A+w_A$.

Proof: The proof consists of an exchange argument. We will show that for any adjacent pair of turtles in the tower, we can always arrange them so that the lower turtle's strength plus weight is at least the upper turtle's, while still maintaining the tower's stability. If this is possible, then by a 'bubble sort'-style argument, we can repeatedly swap out-of-order adjacent pairs until the tower is sorted by each turtle $t$'s $s_t+w_t$.

Consider a tower consisting only of a pair of turtles $A,B$. Our task is to prove that if $B$ is placed below $A$, then $s_B+w_B \geq s_A+w_A$.

  • Case 1: assume $B$ supports $A$ but $A$ does not support $B$; that is, $s_B\geq w_A$ and $w_B > s_A$. Adding these inequalities, we obtain $s_B+w_B>s_A+w_A$ as required. Easy peasy!
  • Case 2: assume $B$ supports $A$ and vice versa. Regardless of their ordering, the total weight of their tower will obviously be $w_A+w_B$. Furthermore, if $B$ is placed below $A$, the extra weight the tower can support is $\min(s_A,s_B-w_A)$, and if $A$ is placed below $B$, it is $\min(s_B,s_A-w_B)$. We wish to maximise the weight the tower can hold above it, and we can show that placing $B$ below $A$, where $s_B+w_B\geq s_A+w_A$, accomplishes this; that is, $\min(s_A,s_B-w_A)\geq \min(s_B,s_A-w_B)$. We will work backwards from the statement we wish to prove until we reach a true statement, noting that every step is reversible (not the most elegant proof, but whatever).
    \[\begin{aligned} \min(s_A,s_B-w_A) &\geq \min(s_B,s_A-w_B)\\ \min(s_A,s_B-w_A) + w_A + w_B &\geq \min(s_B,s_A-w_B) + w_A + w_B\\ \min(s_A+w_A+w_B,s_B+w_B) &\geq \min(s_B+w_B+w_A,s_A+w_A)\quad \text{(by distributivity of min)}\\ \min(s_A+w_A+w_B,s_B+w_B) &\geq s_A+w_A\quad \text{(by our assumption } s_B+w_B\geq s_A+w_A\text{)} \end{aligned}\] Clearly, both $s_A+w_A+w_B$ and $s_B+w_B$ are greater than or equal to $s_A+w_A$, so the final statement is true. Since every step only added the same quantity to both sides, the chain is reversible, so the original inequality holds and placing $B$ below $A$ is indeed optimal!
These two cases are exhaustive for all situations where $B$ is placed below $A$, so our proof is complete!
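As a sanity check, we can brute-force the key inequality from Case 2 over a grid of small turtle values (this check is mine, not part of the proof):

```python
# Exhaustively check the Case 2 inequality for small values: whenever
# s_B + w_B >= s_A + w_A, placing B below A supports at least as much
# additional weight as placing A below B.
violations = 0
for sA in range(12):
    for wA in range(12):
        for sB in range(12):
            for wB in range(12):
                if sB + wB >= sA + wA:
                    if min(sA, sB - wA) < min(sB, sA - wB):
                        violations += 1
print(violations)  # 0
```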

The full solution is left as an exercise to the reader :) it can be solved in $O(n^2)$ time and $O(n)$ space.
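For what it's worth, here's one way the exercise could be sketched in Python (the function and variable names are my own): sort by strength plus weight, then run a knapsack-style DP where dp[j] stores the minimum total weight of a stable tower of j turtles.

```python
def tallest_tower(turtles):
    # turtles: list of (strength, weight) pairs.
    INF = float('inf')
    n = len(turtles)
    # Sort so turtles with larger strength + weight come later:
    # by the lemma, they are the candidates for lower positions.
    order = sorted(turtles, key=lambda t: t[0] + t[1])
    # dp[j] = minimum total weight of a stable tower of j turtles
    # built from the turtles processed so far.
    dp = [0] + [INF] * n
    for s, w in order:
        # Try this turtle as the new bottom of a (j-1)-turtle tower;
        # iterate j downwards so each turtle is used at most once.
        for j in range(n, 0, -1):
            if dp[j - 1] <= s:
                dp[j] = min(dp[j], dp[j - 1] + w)
    return max(j for j in range(n + 1) if dp[j] < INF)

print(tallest_tower([(5, 5), (10, 10)]))  # 2: the heavier turtle goes below
```

The inner loop over j gives the $O(n^2)$ time, and the single dp array gives the $O(n)$ space.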

EDIT: after years, I've finally seen a satisfying explanation of the sort order. Suppose a stable stack has turtle $i$ at the bottom and total weight $W$; since $i$ supports everything above it, $W \leq s_i + w_i$. Now take any turtle $j$ in the stack with $s_j+w_j > s_i+w_i$. Then $s_j > (s_i + w_i) - w_j \geq W - w_j$; that is, turtle $j$ can support the entire stack excluding its own weight. Hence we can place turtle $j$ at the bottom of the stack.

Sunday, May 5, 2013

N Coloured Blobs

N coloured blobs wander around in groups on an open plain. Each group has its own distinct colour, and all coloured blobs in a group share the same colour. Occasionally, two groups will bump into each other: each of the blobs in the smaller group will take on the colour of the larger group, and the two groups will merge. If two groups of equal size bump into each other, the 'mergee' is chosen at random.

A researcher monitors a coloured blob population. Initially, each blob is on its own (i.e. in a group of size 1) and has its own distinct colour. After a sufficient amount of time has passed, all the blobs have become part of one huge group of the same colour. Let K be the total number of times that a blob changed its colour.

A lower bound on K is N-1, if every blob is merged individually into the huge group, hence only changing colour once: from its initial colour to the colour of the huge group. Determine an upper bound on K that is as tight as possible.

This is a sweet little problem with a not terribly difficult solution, but it involves some interesting complexity analysis and allows us to derive a beautiful new implementation for the disjoint-set data structure.

Let's follow a single blob for its lifecycle and consider the number of times that it changes colour. Assume that at every merge, it merges with a blob-set of the same size and takes on the other blob-set's colour.

After the first merge, its blob-set has size 2. After the second merge, its blob-set has size 4. Extrapolating, after the $k$th merge its blob-set has size $2^k$; so if there were $N$ blobs initially, then a single blob can experience at most $\log_2 N$ merges and thus at most $\log_2 N$ colour changes. To obtain our final upper bound, we assume that every blob experiences this many colour changes, giving on the order of $N \log_2 N$ colour changes overall, i.e. $O(N \log N)$. (In fact, the upper bound can be tightened to $\frac{N \log_2 N}{2}$, but being a computer scientist, I'm not concerned about constant factors. seanieg89 from the boredofstudies.org forum proved this upper bound with some fancy maths, and that proof is available here.)
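A little simulation backs this up: always merge the two smallest groups, recolouring the smaller one. The merging policy here is my own assumption about the worst case, not part of the analysis above.

```python
import heapq

def colour_changes(n):
    # Every blob starts in its own group; at each step, merge the two
    # smallest groups. Every blob in the smaller group changes colour.
    sizes = [1] * n
    heapq.heapify(sizes)
    changes = 0
    while len(sizes) > 1:
        a = heapq.heappop(sizes)
        b = heapq.heappop(sizes)
        changes += min(a, b)
        heapq.heappush(sizes, a + b)
    return changes

# For n a power of two, every merge is between equal-size groups,
# hitting (n * log2(n)) / 2 colour changes exactly.
print(colour_changes(8))  # 12 == (8 * 3) / 2
```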

This demonstrates an interesting implementation of the disjoint-set data structure that I hadn't ever considered before. Give each set a label and keep track of each set's size. When it comes time to merge two sets, overwrite the labels of the elements in the smaller set with the larger set's label. This is worst-case time $O(n \log n)$, as opposed to the union-find implementation, which is marginally faster: $O(n \cdot \alpha(n))$, where $\alpha(n)$ is the inverse Ackermann function. However, in many of the places where we'd normally use disjoint sets, like Kruskal's algorithm, we need to sort something of size proportional to $n$ anyway, so the final complexity of the algorithm comes out to $O(n \log n)$ regardless.
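Here's a minimal sketch of that relabelling scheme (class and method names are mine):

```python
class LabelledSets(object):
    # Disjoint sets by relabelling: find is O(1); union relabels the
    # smaller set, so each element is relabelled at most log2(n)
    # times, and all unions together cost O(n log n).
    def __init__(self, n):
        self.label = list(range(n))                 # element -> set label
        self.members = {i: [i] for i in range(n)}   # label -> elements

    def find(self, x):
        return self.label[x]

    def union(self, x, y):
        a, b = self.label[x], self.label[y]
        if a == b:
            return
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a  # make a the label of the larger set
        smaller = self.members.pop(b)
        for e in smaller:
            self.label[e] = a
        self.members[a].extend(smaller)

ds = LabelledSets(4)
ds.union(0, 1)
ds.union(2, 3)
ds.union(0, 2)
print(ds.find(1) == ds.find(3))  # True
```

Note that find does no pointer-chasing at all, which is exactly the "virtually no overhead" property that makes blob-merging attractive.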

So our blob-merging operation, which has virtually no overhead and so is blisteringly fast, can actually replace the standard union-find algorithm in many situations! I've seen plenty of olympiad questions which require fast disjoint sets, but have additional complications which mean that the union-find algorithm doesn't work very nicely. In these situations, it might be preferable to merge some blobs as above.