Spanning Trees
Bart Massey
Min-Heaps
Two operations: insert, extract-min
Both run in O(lg |H|) time on a given heap H using the same basic approach.
Extract-Min
To extract the minimum element of a min-heap H:
H[1] ←→ H[|H|]
e ← H[|H|]
shorten H by one
downheap(H)
return e
To downheap H:
p ← 1
while 2 p <= |H|
c ← p
if H[2 p] < H[c]
c ← 2 p
if 2 p + 1 <= |H| and H[2 p + 1] < H[c]
c ← 2 p + 1
if c = p
return
H[p] ←→ H[c]
p ← c
Insert
To insert an element e in a min-heap H:
lengthen H by one
H[|H|] ← e
upheap(H)
To upheap H:
c ← |H|
while c div 2 >= 1
p ← c div 2
if H[c] >= H[p]
return
H[p] ←→ H[c]
c ← p
Minimum-Weight Spanning Trees
A spanning tree of a connected undirected graph G = (V, E) is a subset T of E such that
- T is a tree (contains no cycles, is connected)
- vertices(T) = V
A spanning tree T on an edge-weighted graph G is minimum-weight if the sum of weights of edges of T is minimal over all possible spanning trees of G
Motivation: Minimum-cost road network. Given all possible routes between cities, pick a set that minimizes the total cost of connecting all the cities (with intersections only in cities)
- Minimizes build cost, not travel time, nor min travel time
MWST: Prim's Algorithm
Idea: Keep adding the cheapest frontier edge to a spanning subtree.
PRIM'S ALGORITHM To find a MWST for an edge-weighted connected graph G = (V, E) T ← empty set S ← empty set Q ← empty min-heap of edges with edge-weight priority insert some vertex v from V into S insert all edges impinging on v into Q while S is a proper subset of V extract min-weight edge e from Q if both vertices of e are in S continue insert e into T let v be the vertex of e not in S insert v into S insert all edges impinging on v into Q return T
Correctness: Algorithm must complete. When it completes, it is guaranteed that the output will be a spanning tree of G. Optimality is trickier: Imagine that some MWST T' for G would not have been output by the algorithm: instead some larger-weight spanning tree T is output. Now, stop the algorithm at the first point where the edge selection for T does not match the choice of T'. By assumption, T' is optimal, so the weight of the subtree covered so far is optimal, and the weight of the remaining subtree of T' is going to be optimal. This means that the algorithm needs to make a mistake at this point in connecting the two subtrees: but the algorithm chooses a smallest connector of the subtrees. Thus, T' cannot be smaller than T.
Cost: O(|E| lg |V|), since each processed edge involves an extraction and a bounded number of insertions from the heap.
Union-Find (Disjoint-Set)
Want to keep track of which vertices of a graph being built are connected.
Idea: Keep an "upside-down tree" as vertices are added. Make each added vertex point to its parent.
Now the root node of this tree is a "representative" of the whole collection of connected vertices. If two nodes point to the same root, they are in the same component.
Can "pull up" a tree during lookup to make the lookups really cheap.
MWST: Kruskal's Algorithm
Idea: Keep connecting disjoint subtrees with minimum-weight edges.
KRUSKAL'S ALGORITHM To find a MWST for an edge-weighted connected graph G = (V, E) T ← empty set for v in V init connected[v] to v for e in E in lightest to heaviest order if both vertices of e are in the same component continue insert e into T join components of vertices return T
Correctness: Makes exactly one pass, so terminates. Builds a spanning tree. Optimal by induction on tree size. Base case: builds smallest trees connecting individual vertices. Inductive case: assume the algorithm selects e to connect T1 and T2p producing T. T1 and T2 are optimal subtrees by assumption. e is the minimum-weight edge in the graph that can connect T1 and T2, so T is also optimal.
Cost: O(|E| lg |E2|) = O(|E| lg |V|) when union-find is faster than log time (it is).