Graph Algorithms

Bart Massey

Review: Sets

  • Collections of unique unordered values

  • Representations:

    • Characteristic Vector e.g. bitset O(1) time O(|U|) space

    • Search Trie O(lg |S|) time O(|S|) space

    • Unordered List, Ordered List

Graphs

  • Set of vertices V and set of edges E, where an edge is a pair of vertices drawn from V

  • Graphs are usually undirected, so the edge (v1, v2) and the edge (v2, v1) are considered the same. Or can represent with two edges

  • A directed graph has ordered edges, so (v1, v2) is not generally the same as (v2, v1). Think of the edges as arrows from the first vertex to the second.

Generalizations of Graphs

  • A multigraph treats edges as a collection rather than a set: the same pair of vertices can be connected by many equivalent edges.

  • A hypergraph treats edges as sets of vertices: a single edge can connect many vertices.

Labeled Graphs

  • It is common to label the vertices, edges or both.

  • It is convenient to treat the labeling as a separate function over some domain L of label values

      l: V -> L
      l': E -> L'
    
  • This avoids gumming up the graph.

Example: Highway Graph

  • Consider a graph with V = set of US cities and E = highway connections between cities.

  • We're already in trouble, because more than one highway may connect two cities. Usually pick the most relevant one for a given problem.

  • Maybe label the vertices with city names. Could be populations or somesuch depending on the problem. Could be nothing for some problems.

  • Maybe label the edges with a relevant highway property: length, travel-time, accident probability, whatever.

Example: Dependency Graph

  • Consider a graph with V = set of tasks to be performed and E = end-start dependencies. That is, if task 1 has to be done before task 2 can be started, there is an edge (v1, v2) in E.

  • Label the vertices with a time-to-complete (and a name, probably).

  • For example, consider making a turkey dinner. The gravy can't be made until after the turkey is cooked.

  • This is clearly a directed graph. As it stands, it is not necessarily connected: some tasks could maybe be done anytime relative to everything else.

  • Can make it connected by adding artificial zero-length "start dinner" and "serve dinner" tasks.

  • Do we include indirect dependencies? That is, if we have ("cook turkey", "make gravy") and ("make gravy", "gravy potatoes") should we also have an explicit edge for ("cook turkey", "gravy potatoes"). Probably not: try to keep the graph sparse.

Graph Properties

  • A simple graph has no edge (v, v). This is the usual case.

  • A path through a graph (V, E) is a sequence of vertices v1, v2 .. vn of V such that each pair of adjacent vertices in the path is an edge in E (in some order)

  • Graphs are usually connected: for every pair of vertices there is a path leading from one to the other

  • A graph with no path v..v is an acyclic graph. A connected acyclic graph is a tree.

Dense and Sparse Graphs

  • For a simple graph

      |E| <= |V| (|V| - 1) / 2
    

    (why?) We say a graph close to this bound is dense.

  • For a simple graph

      |E| >= |V| - 1
    

    (why?) We say a graph close to this bound is sparse.

  • A more precise but less-used definition: a family of sparse graphs has |E| in O(|V|). A family of dense graphs has |E| in Omega(|V|^2)

Some Graph Representations

  • Edge list

    • What it sounds like
    • List is usually an array
    • Vertices may be implicit
    • Efficient for algorithms that examine all edges
    • If edges are in sorted order (and duplicated appropriately for undirected graphs) can find edges with binary search
  • Adjacency list

    • adj function from a vertex to a set of "neighbors"
    • Function is usually an array of lists or sets
    • Edges are usually implicit
    • Efficent for algorithms that examine all vertices
    • If neighbors are in sorted order, can find with binary search
  • Adjacency matrix

    • Boolean matrix with entry vi, vj true iff there is an edge connecting vi and vj
    • Edges and vertices are usually implicit
    • Efficient for algorithms that need to check for an edge given two vertices

Review: Queues

  • Container with limited access

  • "First-In First-Out": extract function gets oldest inserted element

  • Easy implementation: Singly-linked list with head and tail pointer

  • Fairly easy implementation: "Circular" array.

Graph Traversal

PROBLEM GRAPH TRAVERSAL

Given: A graph G = (V, E). A distinguished start
       node v0 in V.

Find: An enumeration of the nodes and edges
      of G reachable from v0

Breadth-First Traversal

ALGORITHM BREADTH-FIRST TRAVERSAL

P <- new empty array/dictionary
Q <- new empty queue
V <- new empty set
E <- new empty set
insert v0 into Q
while Q is not empty:
  extract v from Q
  insert v into V
  for (v,w) in edges leaving v
    insert (v, w) into E
    if w not in V
      insert (w -> v) into P
      insert w into Q
return P
  • Running time: O(|E|) on connected graph

  • Correctness: No node is inserted more than once, every node is inserted eventually, every path step is legal

  • Path optimality: Following parents from P leads to the root in shortest number of steps.

  • If G is not connected, G' = (V, E) is a connected component of G. Start the traversal at some node in G.V \ V to find another component.

Two-Coloring

  • Find an assignment of the labels Black and White to the vertices of a graph such that no Black vertex has a White neighbor or vice versa. Or fail if impossible.

  • Breadth-First traversal solves this by labeling vertices the opposite color of their parent (necessary condition), then checking all the edges to see if they are two-colored.

  • Non-theorem: any graph with maximum degree two can be two-colored. Counterexample?

Review: Stacks

  • Container with limited access

  • "Last-In First-Out": extract function gets newest inserted element

  • Easy implementation: Singly-linked list, array with stack pointer.

Depth-First Traversal

ALGORITHM DEPTH-FIRST TRAVERSAL

P <- new empty dictionary
S <- new empty stack
V <- new empty set
E <- new empty set
insert v0 into S
while S is not empty:
  extract v from S
  insert v in V
  for (v, w) in neighbors of v
    insert (v, w) into E
    if w not in V
      insert (w -> v) into P
      insert v into S
      insert w into S
      break
return P
  • Running time: O(|E|) on connected graph

  • Correctness: No node is inserted more than once, every node is inserted eventually, every path step is legal

  • Note that this is the same algorithm but with a stack instead of a queue.

  • Determine whether a connected graph is a tree by noting whether the "if" test in the algorithm fails. This will be a back edge, creating a cycle.

  • The Articulation Vertices section in the book is a nice exercise, but too long for class. Read it and ask questions next week.

Directed Graphs, DAGs, Precedence Graphs

  • Directed graph not so different: Just directed edges.

  • Rooted Connected Directed Acyclic Graph (DAG)

  • Precedence graph is DAG representing a partial order

Topological Ordering

PROBLEM TOPO ORDER

Given: A DAG G = (V, E) representing a partial order on V.

Find: A total order <= on V consistent with G.

Source-To-Sink

ALGORITHM SOURCE-TO-SINK TOPO SORT

P <- new empty sequence
Q <- new empty set
insert roots of G into Q
while Q is not empty:
  extract some vertex v from Q
  append v to P
  for w in children of v
    if all parents of w are in P
      insert w in Q
return P
  • Performance: O(|E|)

  • Correctness: Invariant on P holds

  • Note that this is the same algorithm as depth-first traversal.

  • Can also go Sink-To-Source