Max-Flow Min-Cut Theorem and Karger’s Algorithm
These notes explore network flow optimization and randomized algorithms, focusing on the Max-Flow Min-Cut theorem and Karger’s algorithm.
Greedy Algorithms and Randomized Approaches
- Greedy algorithms make a series of choices, selecting one activity after another without backtracking.
- A twist: rather than never backtracking, at each step we may also revert choices we’ve already made, as long as the solution keeps improving.
- Randomized greedy algorithms introduce randomness at each step.
- If you try enough times, you will eventually get it right.
- Key question: How many times is enough?
Two things to watch out for with this approach:
- Running time: Could we keep changing our minds forever?
- With integer capacities, each iteration of Ford-Fulkerson increases the flow by ≥ 1. Therefore, at most |f| iterations!
- Optimality: How do we rule out bad local optima?
- The s-t Min-Cut certifies that our flow is optimal!
Max-Flow Min-Cut Theorem
The Max-Flow Min-Cut theorem states that the value of a maximum flow from s to t is equal to the capacity of a minimum s-t cut. This essentially means that minimizing the capacity of cuts is the same as maximizing the value of flows.
Intuition: In a maximum flow, every edge of the minimum cut must be saturated; the min cut is the bottleneck.
Max-Flow Min-Cut in Water Pipes
- Max-flow: The maximum amount of water we can send.
- Min-cut: The minimum total capacity of pipes which, when removed, completely disconnect the water source from the sink (i.e., no water flows).
Ford-Fulkerson Algorithm
- Find an augmenting path in the residual graph.
- Increase the flow along that path.
- Repeat until you can’t find any more paths, and then you’re done!
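The three steps above can be sketched in Python. This is a minimal, illustrative implementation (the name `max_flow` and the dict-of-dicts graph format are my own choices, not from the notes); it finds augmenting paths with BFS, which is the Edmonds-Karp variant mentioned later:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Ford-Fulkerson with BFS path selection (Edmonds-Karp).
    capacity: dict-of-dicts, capacity[u][v] = capacity of edge u -> v."""
    # Residual capacities start equal to the original capacities,
    # with reverse edges at 0 so that flow can later be "undone".
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Step 1: BFS for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # Step 3: no augmenting path left, so we're done
        # Recover the path and its bottleneck capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        # Step 2: push flow; decrease forward residuals, increase reverse ones.
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck
```

For example, on a graph where s can send 10 units via a and 5 via b, the function augments along one path at a time until the residual graph has no s-t path left.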
Global Minimum Cuts
(Global = no s, t, undirected, unweighted)
Karger’s Algorithm
- Finds global minimum cuts in undirected graphs.
- Randomized algorithm.
- Karger’s algorithm might be wrong.
- Compare to QuickSort, which just might be slow.
- Why would we want an algorithm that might be wrong?
- With high probability, it won’t be wrong.
- Maybe the stakes are low and the cost of a deterministic algorithm is high.
Las Vegas vs. Monte Carlo Algorithms
- QuickSort is a Las Vegas randomized algorithm.
- It is always correct.
- It might be slow.
- Karger’s Algorithm is a Monte Carlo randomized algorithm.
- It is always fast.
- It might be wrong.
Las Vegas: Guarantees correctness, but runtime is a random variable (RV).
Monte Carlo: Correctness is an RV, but runtime is guaranteed!
Karger’s Algorithm: Running Time
- Randomly contract edges until there are only two supernodes left.
- We contract at most n − 2 edges: each contraction removes one vertex, and we go from n vertices down to 2 supernodes.
- Naively, each contraction takes time O(n).
- So, the total running time is O(n²).
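One run of the contraction procedure might look like the following sketch (the name `karger_min_cut` is my own; supernode membership is tracked with simple pointer-chasing, kept deliberately naive for illustration):

```python
import random

def karger_min_cut(edges, n_vertices):
    """One run of Karger: contract random edges until 2 supernodes remain.
    edges: list of (u, v) pairs over vertices 0..n_vertices-1.
    Returns the size of the cut found (a min cut only with some probability)."""
    # leader[v] tracks which supernode v currently belongs to.
    leader = list(range(n_vertices))

    def find(v):
        # Naive find with path halving.
        while leader[v] != v:
            leader[v] = leader[leader[v]]
            v = leader[v]
        return v

    supernodes = n_vertices
    while supernodes > 2:
        u, v = random.choice(edges)
        ru, rv = find(u), find(v)
        if ru != rv:          # skip self-loops inside a supernode
            leader[ru] = rv   # contract: merge the two supernodes
            supernodes -= 1
    # Original edges whose endpoints lie in different supernodes cross the cut.
    return sum(1 for u, v in edges if find(u) != find(v))
```

A single run often misses the minimum cut; the success-probability analysis below says how many independent runs make failure unlikely.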
Probability of Success
The probability that a single run of Karger’s algorithm returns a minimum cut is at least 1/(n choose 2). So if G has n vertices, repeating Karger’s algorithm O(n²) times finds a minimum cut in G with probability at least 0.99, in total time O(n⁴).
Boosting Success Probabilities
- If we have a Monte-Carlo algorithm with a small success probability,
- and we can check how good a solution is,
- then we can boost the success probability by repeating it a bunch and taking the best solution.
- If we can repeat only the shorter, riskier part of the algorithm, it can be much faster (Karger-Stein)!
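The boosting arithmetic can be made concrete: since (1 − p)^k ≤ e^(−pk), about ln(1/δ)/p independent runs suffice to succeed with probability ≥ 1 − δ. A small sketch (the function name is my own):

```python
import math

def repetitions_needed(p, delta):
    """Number of independent runs so that at least one succeeds with
    probability >= 1 - delta, given per-run success probability p.
    Uses (1 - p)^k <= e^(-p*k), so k = ln(1/delta)/p suffices."""
    return math.ceil(math.log(1 / delta) / p)

# Karger on n vertices succeeds with probability >= 1 / (n choose 2),
# so ~ (n choose 2) * ln(100) runs give 99% success: O(n^2) repetitions.
n = 100
p = 1 / math.comb(n, 2)
print(repetitions_needed(p, 0.01))
```

With an O(n²)-time run repeated O(n²) times, this recovers the O(n⁴) bound above.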
Union-Find Data Structure
- Also called a disjoint-set data structure.
- Used for storing collections of sets.
- Supports:
- makeSet(u): Create a set {u}.
- find(u): Return the set that u is in.
- union(u,v): Merge the set that u is in with the set that v is in.
- Very fast: α(n) (inverse Ackermann) amortized time per operation!
- Useful for Kruskal (MST) and Karger (Min-Cut).
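A standard implementation with path compression and union by rank looks roughly like this (a sketch of the usual textbook version, not necessarily the exact one from lecture):

```python
class UnionFind:
    """Disjoint-set data structure with path compression and union by
    rank: near-constant (inverse-Ackermann) amortized time per operation."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, u):
        """Create the singleton set {u} if u is not already present."""
        if u not in self.parent:
            self.parent[u] = u
            self.rank[u] = 0

    def find(self, u):
        """Return the representative of u's set."""
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point everything on the path at the root.
        while self.parent[u] != root:
            self.parent[u], u = root, self.parent[u]
        return root

    def union(self, u, v):
        """Merge the set containing u with the set containing v."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
```

In Kruskal, `find` detects whether an edge would create a cycle; in Karger, `union` performs a contraction and `find` tells us which supernode a vertex belongs to.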
Key Ideas for Ford-Fulkerson
- In order to increase total flow, we decreased the flow on some edges.
- Decreasing flow in one direction = increasing flow in the opposite direction = “sending back” this flow.
Why Residual Graphs?
- Fix the problem of committing to bad early path choices by allowing “undo” operations!
- Akin to reversing our previous decision in a greedy algorithm.
- Encoding these operations is the main goal of a residual graph.
What is a Residual Graph?
- Stores how much flow can be pushed through an edge.
- Setting each reverse edge’s capacity equal to the current flow encodes that the current flow can be “undone!”
How do we choose which paths to use?
- The analysis we did still works no matter how we choose the paths.
- That is, the algorithm will be correct if it terminates.
- However, the algorithm may not be efficient!!!
- May take a long time to terminate.
- (Or may actually never terminate?)
- We need to be careful with our path selection to make sure the algorithm terminates quickly.
- Using BFS leads to the Edmonds-Karp algorithm.
Karger-Stein Algorithm
Just repeating Karger’s algorithm isn’t the best use of repetition: early contractions rarely destroy the minimum cut, so the risk is concentrated near the end of a run. Instead, Karger-Stein repeats when it counts. If we wait until there are n/√2 nodes left, the probability that we fail is close to 1/2, so that is where we branch. This lets us find a global minimum cut (with high probability) in an undirected graph in time O(n² log²(n)). Notice that we can’t do better than O(n²) in a dense graph (we need to look at all the edges), so this is pretty good.
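The recursive structure can be sketched as follows. This is a simplified illustration (the helpers `contract_to` and `karger_stein` are my own names, and real implementations maintain the edge list more carefully to achieve the stated running time): contract down to about n/√2 supernodes once, then recurse twice on the result and keep the smaller cut.

```python
import math
import random

def contract_to(edges, n, target):
    """Randomly contract edges of a connected graph on vertices 0..n-1
    until `target` supernodes remain; return the relabelled edge list
    (self-loops removed)."""
    leader = list(range(n))

    def find(v):
        while leader[v] != v:
            leader[v] = leader[leader[v]]
            v = leader[v]
        return v

    remaining = n
    while remaining > target:
        u, v = random.choice(edges)
        ru, rv = find(u), find(v)
        if ru != rv:
            leader[ru] = rv
            remaining -= 1
    roots = sorted({find(v) for v in range(n)})
    relabel = {r: i for i, r in enumerate(roots)}
    return [(relabel[find(u)], relabel[find(v)])
            for u, v in edges if find(u) != find(v)]

def karger_stein(edges, n):
    """One run of Karger-Stein; succeeds with probability Omega(1/log n),
    so it is repeated O(log^2 n) times for high-probability correctness."""
    if n <= 6:
        # Small base case: fall back to repeated plain contraction.
        return min(len(contract_to(edges, n, 2)) for _ in range(n * n))
    target = math.ceil(1 + n / math.sqrt(2))
    # Contract to ~n/sqrt(2) twice, recurse on each, keep the smaller cut.
    e1 = contract_to(edges, n, target)
    e2 = contract_to(edges, n, target)
    return min(karger_stein(e1, target), karger_stein(e2, target))
```

The point of branching at n/√2 is that each half of the work fails with probability about 1/2, so two independent tries at that depth roughly cancel the risk, at only a constant-factor cost per level.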
Algorithms and Commensurability
Algorithms often rely on commensurability: a common measure of value to compare items and the “distance” between them.
- Well-measured attributes.
- Measurements reflect latent attributes.
- Entire populations are appropriately represented.
Wicked Problems
- No definitive formulation: What exactly is the goal? To generate text at a human level? At a good enough level to help with certain tasks?
- No stopping rule: When are you finished training? When does it count as “working”?
- Solutions are not true or false, but better or worse: One LLM’s output can be better than another’s, but neither is “correct”.
- No immediate or ultimate test of a solution: What exactly is the LLM supposed to output for a given input? Not quite sure!
- No safe opportunity to learn by trial-and-error: Training is very costly in terms of time and energy; failed attempts are big setbacks.