Spanning Trees

Bart Massey


  • Two operations: insert, extract-min

  • Both run in O(lg |H|) time on a given heap H using the same basic approach.


      To extract the minimum element of a min-heap H:
          H[1] ←→ H[|H|]
          e ← H[|H|]
          shorten H by one
          return e

      To downheap H:
          p ← 1
          while 2 p <= |H|
            c ← p
            if H[2 p] < H[c]
              c ← 2 p
            if 2 p + 1 <= |H| and H[2 p + 1] < H[c]
              c ← 2 p + 1
            if c = p
            H[p] ←→ H[c]
            p ← c


      To insert an element e in a min-heap H:
          lengthen H by one
          H[|H|] ← e

      To upheap H:
          c ← |H|
          while c div 2 >= 1
            p ← c div 2
            if H[c] >= H[p]
            H[p] ←→ H[c]
            c ← p

Minimum-Weight Spanning Trees

  • A spanning tree of a connected undirected graph G = (V, E) is a subset T of E such that

    • T is a tree (contains no cycles, is connected)
    • vertices(T) = V
  • A spanning tree T on an edge-weighted graph G is minimum-weight if the sum of weights of edges of T is minimal over all possible spanning trees of G

  • Motivation: Minimum-cost road network. Given all possible routes between cities, pick a set that minimizes the total cost of connecting all the cities (with intersections only in cities)

    • Minimizes build cost, not travel time, nor min travel time

MWST: Prim's Algorithm

  • Idea: Keep adding the cheapest frontier edge to a spanning subtree.

      To find a MWST for an edge-weighted connected graph G = (V, E)
          T ← empty set
          S ← empty set
          Q ← empty min-heap of edges with edge-weight priority
          insert some vertex v from V into S
          insert all edges impinging on v into Q
          while S is a proper subset of V
            extract min-weight edge e from Q
            if both vertices of e are in S
            insert e into T
            let v be the vertex of e not in S
            insert v into S
            insert all edges impinging on v into Q
          return T
  • Correctness: Algorithm must complete. When it completes, it is guaranteed that the output will be a spanning tree of G. Optimality is trickier: Imagine that some MWST T' for G would not have been output by the algorithm: instead some larger-weight spanning tree T is output. Now, stop the algorithm at the first point where the edge selection for T does not match the choice of T'. By assumption, T' is optimal, so the weight of the subtree covered so far is optimal, and the weight of the remaining subtree of T' is going to be optimal. This means that the algorithm needs to make a mistake at this point in connecting the two subtrees: but the algorithm chooses a smallest connector of the subtrees. Thus, T' cannot be smaller than T.

  • Cost: O(|E| lg |V|), since each processed edge involves an extraction and a bounded number of insertions from the heap.

Union-Find (Disjoint-Set)

  • Want to keep track of which vertices of a graph being built are connected.

  • Idea: Keep an "upside-down tree" as vertices are added. Make each added vertex point to its parent.

  • Now the root node of this tree is a "representative" of the whole collection of connected vertices. If two nodes point to the same root, they are in the same component.

  • Can "pull up" a tree during lookup to make the lookups really cheap.

MWST: Kruskal's Algorithm

  • Idea: Keep connecting disjoint subtrees with minimum-weight edges.

      To find a MWST for an edge-weighted connected graph G = (V, E)
          T ← empty set
          for v in V
             init connected[v] to v
          for e in E in lightest to heaviest order
            if both vertices of e are in the same component
            insert e into T
            join components of vertices
          return T
  • Correctness: Makes exactly one pass, so terminates. Builds a spanning tree. Optimal by induction on tree size. Base case: builds smallest trees connecting individual vertices. Inductive case: assume the algorithm selects e to connect T1 and T2p producing T. T1 and T2 are optimal subtrees by assumption. e is the minimum-weight edge in the graph that can connect T1 and T2, so T is also optimal.

  • Cost: O(|E| lg |E2|) = O(|E| lg |V|) when union-find is faster than log time (it is).