Tuesday, May 28, 2013

More like "hacks-imum flow"! Hahaha!

So I have a hunch that the Python maximum flow implementation on Wikipedia is implemented incorrectly. Their find_path method is as follows:

    def find_path(self, source, sink, path):
        if source == sink:
            return path
        for edge in self.get_edges(source):
            residual = edge.capacity - self.flow[edge]
            if residual > 0 and not (edge,residual) in path:
                result = self.find_path( edge.sink, sink, path + [(edge,residual)] ) 
                if result != None:
                    return result
The following condition prevents the depth-first search from getting itself stuck in cycles, as it searches over the residual network:

            if residual > 0 and not (edge,residual) in path:
That looks fine. However, consider the case where your residual network is a dense directed acyclic graph with all edge weights positive. Furthermore, the source node is the 'root' of this DAG (it's an ancestor of every other node), and there is exactly one path from the source to the sink. The number of outgoing paths from the source will increase exponentially in proportion to the number of nodes!

The following graphic illustrates such a case. The source is A and the sink is B. All black edges have residual flow 1, and the red edge has residual flow 0. As the number of horizontal and vertical 'layers' in the graph increases and the number, the number of cycle-free paths from A to X increases exponentially, and none of these paths lead to the sink. So the algorithm could potentially keep trying these faulty paths before ever stumbling upon a path that leads to B.


Of course, I've almost certainly overlooked something. After all, Wikipedia doesn't have any mistakes, right?
As a plea for your forgiveness, I'll post my max-flow code. It doesn't have any of that silly 'object-oriented' stuff. The network example I use is the one found here.
import copy

N = 7
adj = [[0]*N for _ in range(N)]
adj[0][1] = 3
adj[0][2] = 1
adj[1][3] = 3
adj[2][3] = 5
adj[2][4] = 4
adj[3][6] = 2
adj[4][5] = 2
adj[5][6] = 3
source = 0
sink = 6

# the residual network begins as a copy of the original network
residual = copy.deepcopy(adj)

def get_augpath():
    seen = [False] * N

    def dfs(cur):
        if seen[cur]: return
        seen[cur] = True

        for next in xrange(N):
            if residual[cur][next] > 0:
                if next == sink:
                    # cur has an edge directly to the sink
                    return [cur, next]
                else:
                    # suffix is a path from cur to sink, or [] if none exists
                    suffix = dfs(next)
                    if suffix:
                        return [cur] + suffix
        # no paths found
        return []
    return dfs(source)

def max_flow():
    mf  = 0

    path = get_augpath()
    # keep iterating while a path exists
    while path:
        # get the min-cost edge along our augmenting path
        flow_to_push = min(adj[path[i]][path[i+1]] for i in xrange(len(path)-1))
        mf += flow_to_push
        for i in xrange(len(path) - 1):
            residual[path[i]][path[i+1]] -= flow_to_push
            residual[path[i+1]][path[i]] += flow_to_push
        print 'pushed', flow_to_push, 'along', path

        # try find another augmenting path
        path = get_augpath()
    return mf

print max_flow()

No comments:

Post a Comment