Max-Flow Min-Cut Theorem and Karger’s Algorithm

Let’s look at network flow optimization and randomized algorithms, focusing on the Max-Flow Min-Cut theorem and Karger’s algorithm.

Greedy Algorithms and Randomized Approaches

  • Greedy algorithms make a series of choices, selecting one activity after another without backtracking.
  • Alternatively, instead of never backtracking, we can revert choices we’ve already made at each step, as long as the solution keeps improving.
  • Randomized greedy algorithms introduce randomness at each step.
  • If you try enough times, you will eventually get it right.
  • Key question: How many times is enough?

Two things to watch out for with this approach:

  1. Running time: Could we keep changing our minds forever?
     • No: with each iteration of Ford-Fulkerson, we increase the flow by ≥ 1 (assuming integer capacities). Therefore, at most |f| iterations, where |f| is the value of a maximum flow!
  2. Optimality: How do we rule out bad local optima?
     • The s-t min-cut certifies that our flow is optimal!

Max-Flow Min-Cut Theorem

The Max-Flow Min-Cut theorem states that the value of a maximum flow from s to t is equal to the capacity of a minimum s-t cut. This essentially means that minimizing the capacity of cuts is the same as maximizing the value of flows.

Intuition: In a maximum flow, the edges crossing a minimum cut had better be saturated; the min cut is the bottleneck.

Max-Flow Min-Cut in Water Pipes

  • Max-flow: The maximum amount of water we can send.
  • Min-cut: The minimum total capacity of pipes that, when removed, completely disconnects the water source from the sink (i.e., no water flows).

Ford-Fulkerson Algorithm

  • Find an augmenting path in the residual graph.
  • Increase the flow along that path.
  • Repeat until you can’t find any more paths, and then you’re done!
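Here is a minimal sketch of this loop in Python (the dict-of-dicts capacity representation and the function names are our own choices, not from the lecture). A simple DFS finds the augmenting paths; the returned residual graph also exhibits the Max-Flow Min-Cut theorem, since the vertices still reachable from s form the s-side of a minimum cut:

```python
def ford_fulkerson(capacity, s, t):
    """Max flow via augmenting paths. capacity[u][v] = capacity of edge u -> v."""
    # Residual capacities start equal to the original capacities;
    # reverse edges start at 0 so that flow can later be "undone".
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in list(residual):
        for v in list(residual[u]):
            residual.setdefault(v, {}).setdefault(u, 0)

    def find_augmenting_path():
        # DFS for any s-t path with remaining residual capacity.
        parent, stack = {s: None}, [s]
        while stack:
            u = stack.pop()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    stack.append(v)
        return None  # no augmenting path left

    flow = 0
    while (parent := find_augmenting_path()) is not None:
        # Walk back from t to recover the path and its bottleneck capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        # Push flow: forward residuals shrink, reverse residuals grow.
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck
    return flow, residual
```

On a small example, the flow value matches the cheapest cut:

```python
capacity = {'s': {'a': 3, 'b': 2}, 'a': {'b': 1, 't': 2}, 'b': {'t': 3}}
flow, residual = ford_fulkerson(capacity, 's', 't')
print(flow)  # 5 -- and the cut ({s}, {a, b, t}) has capacity 3 + 2 = 5
```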

Global Minimum Cuts

(Global = no designated s or t; the graph is undirected and unweighted.)

Karger’s Algorithm

  • Finds global minimum cuts in undirected graphs.
  • Randomized algorithm: Karger’s algorithm might be wrong. (Compare to QuickSort, which just might be slow.)
  • Why would we want an algorithm that might be wrong?
     • With high probability, it won’t be wrong.
     • Maybe the stakes are low and the cost of a deterministic algorithm is high.

Las Vegas vs. Monte Carlo Algorithms

  • QuickSort is a Las Vegas randomized algorithm.
     • It is always correct.
     • It might be slow.
  • Karger’s algorithm is a Monte Carlo randomized algorithm.
     • It is always fast.
     • It might be wrong.

Las Vegas: Guarantees correctness, but runtime is a random variable (RV).

Monte Carlo: Correctness is an RV, but runtime is guaranteed!

Karger’s Algorithm: Running Time

  • Randomly contract edges until there are only two supernodes left.
  • Each contraction merges two supernodes into one, so we go from n supernodes down to 2 in exactly n − 2 contractions.
  • Naively, each contraction takes time O(n).
  • So, the total running time is O(n²).
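A minimal sketch of one run in Python (the edge-list-plus-labels representation is our own choice; the O(n) relabeling per contraction matches the naive bound above):

```python
import random

def karger_min_cut(n, edges):
    """One run of Karger's contraction algorithm on a connected undirected graph.
    n: number of vertices, labeled 0..n-1; edges: list of (u, v) pairs.
    Returns the size of the cut found (a MIN cut only with some probability!)."""
    label = list(range(n))  # label[v] = supernode that v currently belongs to
    supernodes = n
    while supernodes > 2:
        u, v = random.choice(edges)  # uniformly random edge
        ru, rv = label[u], label[v]
        if ru == rv:
            continue  # edge is now a self-loop inside a supernode; resample
        # Contract: fold v's supernode into u's (O(n) relabeling).
        for w in range(n):
            if label[w] == rv:
                label[w] = ru
        supernodes -= 1
    # Edges whose endpoints lie in different supernodes cross the cut.
    return sum(1 for u, v in edges if label[u] != label[v])
```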

Probability of Success

The probability that a single run of Karger’s algorithm returns a minimum cut is at least 1/(n choose 2) = 2/(n(n−1)), where n is the number of vertices of G. Each run takes O(n²) time, and (n choose 2) · ln(100) = O(n²) independent runs drive the failure probability below 0.01. So repeating Karger’s algorithm finds a minimum cut in G with probability at least 0.99 in total time O(n⁴).
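Where does 1/(n choose 2) come from? The standard telescoping argument: when k supernodes remain, every supernode has degree at least c (the size of a fixed min cut), so at least kc/2 edges remain and a random contraction destroys that cut with probability at most c/(kc/2) = 2/k. A sketch of the product in LaTeX:

```latex
\Pr[\text{fixed min cut survives all } n-2 \text{ contractions}]
  \;\ge\; \prod_{k=3}^{n}\left(1 - \frac{2}{k}\right)
  \;=\; \prod_{k=3}^{n}\frac{k-2}{k}
  \;=\; \frac{2}{n(n-1)}
  \;=\; \binom{n}{2}^{-1}
```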

Boosting Success Probabilities

  • If we have a Monte-Carlo algorithm with a small success probability,
  • and we can check how good a solution is,
  • then we can boost the success probability by repeating it a bunch and taking the best solution.
  • If we can repeat only the shorter, riskier part of the algorithm, it can be much faster (Karger-Stein)!
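As a sketch of plain boosting (reusing the karger_min_cut sketch from above; cuts are easy to check, so “best” just means the smallest cut seen):

```python
import math

def boosted_min_cut(n, edges, delta=0.01):
    """Run Karger's algorithm enough times that the failure probability
    drops below delta, keeping the smallest cut seen. One run succeeds
    with probability >= 1/C(n,2), so T = C(n,2) * ln(1/delta) runs fail
    with probability <= (1 - 1/C(n,2))**T <= exp(-T/C(n,2)) = delta."""
    trials = math.ceil(n * (n - 1) / 2 * math.log(1 / delta))
    return min(karger_min_cut(n, edges) for _ in range(trials))
```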

Union-Find Data Structure

  • Also called a disjoint-set data structure.
  • Used for storing collections of disjoint sets.
  • Supports:
     • makeSet(u): Create a set {u}.
     • find(u): Return (a representative of) the set that u is in.
     • union(u,v): Merge the set that u is in with the set that v is in.
  • Very fast: α(n) amortized time per operation, where α is the inverse Ackermann function (effectively a constant)!
  • Useful for Kruskal (MST) and Karger (Min-Cut).
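A sketch of these operations in Python (path compression plus union by rank is one standard way to reach the α(n) bound; makeSet is folded into the constructor):

```python
class UnionFind:
    """Disjoint-set data structure with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))  # makeSet for u = 0..n-1: each u starts alone
        self.rank = [0] * n           # upper bound on each tree's height

    def find(self, u):
        # Walk up to the set's representative, compressing the path as we go.
        while self.parent[u] != u:
            self.parent[u] = self.parent[self.parent[u]]  # path halving
            u = self.parent[u]
        return u

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False  # already in the same set
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru  # attach the shorter tree under the taller one
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
        return True
```

In Kruskal’s algorithm, calling union on an edge’s endpoints doubles as a cycle check; in Karger’s, it implements contraction without any O(n) relabeling.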

Key Ideas for Ford-Fulkerson

  1. In order to increase total flow, we decreased the flow on some edges.
  2. Decreasing flow in one direction = increasing flow in the opposite direction = “sending back” this flow.

Why Residual Graphs?

  • Fix the problem of getting stuck with a suboptimal flow by allowing “undo” operations!
  • Akin to reversing our previous decision in a greedy algorithm.
  • Encoding these operations is the main goal of a residual graph.

What is a Residual Graph?

  • Stores how much flow can still be pushed through each edge.
  • Setting each reverse edge’s capacity equal to the current flow on the forward edge represents that the current flow can be “undone”!
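Concretely, here is a sketch of the residual capacity of an edge (u, v), assuming capacities and flows are stored in dicts keyed by edge (our own representation):

```python
def residual_capacity(capacity, flow, u, v):
    """Residual capacity of (u, v): the unused forward capacity,
    plus any flow on (v, u) that we could undo by sending it back."""
    return capacity.get((u, v), 0) - flow.get((u, v), 0) + flow.get((v, u), 0)
```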

How do we choose which paths to use?

  • The analysis we did still works no matter how we choose the paths.
  • That is, the algorithm will be correct if it terminates.
  • However, the algorithm may not be efficient!!!
  • May take a long time to terminate.
  • (With irrational capacities, it may actually never terminate!)
  • We need to be careful with our path selection to make sure the algorithm terminates quickly.
  • Using BFS leads to the Edmonds-Karp algorithm.
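Concretely, Edmonds-Karp just swaps the path search for a breadth-first one. A sketch that drops into the Ford-Fulkerson loop above in place of the DFS:

```python
from collections import deque

def bfs_augmenting_path(residual, s, t):
    """Find an augmenting path with the fewest edges. With this choice,
    Ford-Fulkerson needs only O(|V| * |E|) augmentations."""
    parent, queue = {s: None}, deque([s])
    while queue:
        u = queue.popleft()
        for v, cap in residual[u].items():
            if cap > 0 and v not in parent:
                parent[v] = u
                if v == t:
                    return parent  # walk the parent pointers to get the path
                queue.append(v)
    return None  # no augmenting path: the current flow is maximum
```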

Karger-Stein Algorithm

Just repeating Karger’s algorithm isn’t the best use of repetition: the early contractions are very unlikely to destroy the minimum cut, so rerunning them from scratch wastes work. Instead, Karger-Stein repeats when it counts. If we wait until there are n/√2 nodes left, the probability that we’ve failed is still only about 1/2, so that is where we branch into independent retries. This lets us find a global minimum cut (with high probability) in an undirected graph in time O(n² log²(n)). Notice that we can’t do better than O(n²) in a dense graph (we need to look at all the edges), so this is pretty good.
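A sketch of the recursion (the helper names and edge-list representation are ours, and it assumes a connected multigraph). The recurrence T(n) = 2T(n/√2) + O(n²) solves to O(n² log n) per top-level run, and O(log n) such runs give the high-probability bound:

```python
import math
import random

def contract_to(edges, vertices, target):
    """Randomly contract edges until only `target` supernodes remain."""
    label = {v: v for v in vertices}

    def find(v):  # follow merged labels to the current supernode
        while label[v] != v:
            label[v] = label[label[v]]
            v = label[v]
        return v

    live = set(vertices)
    while len(live) > target:
        u, v = random.choice(edges)
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # self-loop inside a supernode; resample
        label[rv] = ru
        live.discard(rv)
    # Rebuild the contracted multigraph, dropping self-loops.
    new_edges = [(find(u), find(v)) for u, v in edges]
    return [(a, b) for a, b in new_edges if a != b], live

def karger_stein(edges, vertices):
    n = len(vertices)
    if n <= 6:  # small base case: just finish contracting
        final_edges, _ = contract_to(edges, vertices, 2)
        return len(final_edges)
    target = math.ceil(n / math.sqrt(2)) + 1  # min cut survives to here w.p. ~1/2
    # Two independent tries over the risky part; keep the better cut.
    return min(karger_stein(*contract_to(edges, vertices, target))
               for _ in range(2))
```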

Algorithms and Commensurability

Algorithms often rely on commensurability: a common measure of value to compare items and the “distance” between them. This presumes, among other things:

  • Well-measured attributes.
  • Measurements reflect latent attributes.
  • Entire populations are appropriately represented.

Wicked Problems

  1. No definitive formulation: What exactly is the goal? To generate text at a human level? At a good enough level to help with certain tasks?
  2. No stopping rule: When are you finished training? When does it count as “working”?
  3. Solutions are not true or false, but better or worse: One LLM’s output can be better than another’s, but neither is “correct”.
  4. No immediate or ultimate test of a solution: What exactly is the LLM supposed to output for a given input? Not quite sure!
  5. No safe opportunity to learn by trial-and-error: Training is very costly in terms of time and energy; failed attempts are big setbacks.