# Graph Algorithms

**Bart Massey**

## Review: Sets

Collections of unique unordered values

Representations:

Characteristic Vector e.g. bitset O(1) time O(|U|) space

Search Trie O(lg |S|) time O(|S|) space

Unordered List, Ordered List

## Graphs

Set of vertices V and set of edges E, where an edge is a pair of vertices drawn from V

Graphs are usually undirected, so the edge (v1, v2) and the edge (v2, v1) are considered the same. Or can represent with two edges

A directed graph has ordered edges, so (v1, v2) is not generally the same as (v2, v1). Think of the edges as arrows from the first vertex to the second.

## Generalizations of Graphs

A multigraph treats edges as a collection rather than a set: the same pair of vertices can be connected by many equivalent edges.

A hypergraph treats edges as sets of vertices: a single edge can connect many vertices.

## Labeled Graphs

It is common to label the vertices, edges or both.

It is convenient to treat the labeling as a separate function over some domain L of label values

`l: V -> L l': E -> L'`

This avoids gumming up the graph.

## Example: Highway Graph

Consider a graph with V = set of US cities and E = highway connections between cities.

We're already in trouble, because more than one highway may connect two cities. Usually pick the most relevant one for a given problem.

Maybe label the vertices with city names. Could be populations or somesuch depending on the problem. Could be nothing for some problems.

Maybe label the edges with a relevant highway property: length, travel-time, accident probability, whatever.

## Example: Dependency Graph

Consider a graph with V = set of tasks to be performed and E = end-start dependencies. That is, if task 1 has to be done before task 2 can be started, there is an edge (v1, v2) in E.

Label the vertices with a time-to-complete (and a name, probably).

For example, consider making a turkey dinner. The gravy can't be made until after the turkey is cooked.

This is clearly a directed graph. As it stands, it is not necessarily connected: some tasks could maybe be done anytime relative to everything else.

Can make it connected by adding artificial zero-length "start dinner" and "serve dinner" tasks.

Do we include indirect dependencies? That is, if we have ("cook turkey", "make gravy") and ("make gravy", "gravy potatoes") should we also have an explicit edge for ("cook turkey", "gravy potatoes"). Probably not: try to keep the graph sparse.

## Graph Properties

A

*simple*graph has no edge (v, v). This is the usual case.A path through a graph (V, E) is a sequence of vertices v1, v2 .. vn of V such that each pair of adjacent vertices in the path is an edge in E (in some order)

Graphs are usually connected: for every pair of vertices there is a path leading from one to the other

A graph with no path v..v is an

*acyclic*graph. A connected acyclic graph is a tree.

## Dense and Sparse Graphs

For a simple graph

`|E| <= |V| (|V| - 1) / 2`

(why?) We say a graph close to this bound is

*dense*.For a simple graph

`|E| >= |V| - 1`

(why?) We say a graph close to this bound is

*sparse*.A more precise but less-used definition: a family of sparse graphs has |E| in O(|V|). A family of dense graphs has |E| in Omega(|V|^2)

## Some Graph Representations

Edge list

- What it sounds like
- List is usually an array
- Vertices may be implicit
- Efficient for algorithms that examine all edges
- If edges are in sorted order (and duplicated appropriately for undirected graphs) can find edges with binary search

Adjacency list

*adj*function from a vertex to a set of "neighbors"- Function is usually an array of lists or sets
- Edges are usually implicit
- Efficent for algorithms that examine all vertices
- If neighbors are in sorted order, can find with binary search

Adjacency matrix

- Boolean matrix with entry vi, vj
*true*iff there is an edge connecting vi and vj - Edges and vertices are usually implicit
- Efficient for algorithms that need to check for an edge given two vertices

- Boolean matrix with entry vi, vj

## Review: Queues

Container with limited access

"First-In First-Out": extract function gets oldest inserted element

Easy implementation: Singly-linked list with head and tail pointer

Fairly easy implementation: "Circular" array.

## Graph Traversal

```
PROBLEM GRAPH TRAVERSAL
Given: A graph G = (V, E). A distinguished start
node v0 in V.
Find: An enumeration of the nodes and edges
of G reachable from v0
```

## Breadth-First Traversal

```
ALGORITHM BREADTH-FIRST TRAVERSAL
P <- new empty array/dictionary
Q <- new empty queue
V <- new empty set
E <- new empty set
insert v0 into Q
while Q is not empty:
extract v from Q
insert v into V
for (v,w) in edges leaving v
insert (v, w) into E
if w not in V
insert (w -> v) into P
insert w into Q
return P
```

Running time: O(|E|) on connected graph

Correctness: No node is inserted more than once, every node is inserted eventually, every path step is legal

Path optimality: Following parents from P leads to the root in shortest number of steps.

If G is not connected, G' = (V, E) is a connected component of G. Start the traversal at some node in G.V \ V to find another component.

## Two-Coloring

Find an assignment of the labels Black and White to the vertices of a graph such that no Black vertex has a White neighbor or vice versa. Or fail if impossible.

Breadth-First traversal solves this by labeling vertices the opposite color of their parent (necessary condition), then checking all the edges to see if they are two-colored.

Non-theorem: any graph with maximum degree two can be two-colored. Counterexample?

## Review: Stacks

Container with limited access

"Last-In First-Out": extract function gets newest inserted element

Easy implementation: Singly-linked list, array with stack pointer.

## Depth-First Traversal

```
ALGORITHM DEPTH-FIRST TRAVERSAL
P <- new empty dictionary
S <- new empty stack
V <- new empty set
E <- new empty set
insert v0 into S
while S is not empty:
extract v from S
insert v in V
for (v, w) in neighbors of v
insert (v, w) into E
if w not in V
insert (w -> v) into P
insert v into S
insert w into S
break
return P
```

Running time: O(|E|) on connected graph

Correctness: No node is inserted more than once, every node is inserted eventually, every path step is legal

Note that this is the

*same algorithm*but with a stack instead of a queue.Determine whether a connected graph is a tree by noting whether the "if" test in the algorithm fails. This will be a back edge, creating a cycle.

The Articulation Vertices section in the book is a nice exercise, but too long for class. Read it and ask questions next week.

## Directed Graphs, DAGs, Precedence Graphs

Directed graph not so different: Just directed edges.

Rooted Connected Directed Acyclic Graph (DAG)

Precedence graph is DAG representing a partial order

## Topological Ordering

```
PROBLEM TOPO ORDER
Given: A DAG G = (V, E) representing a partial order on V.
Find: A total order <= on V consistent with G.
```

## Source-To-Sink

```
ALGORITHM SOURCE-TO-SINK TOPO SORT
P <- new empty sequence
Q <- new empty set
insert roots of G into Q
while Q is not empty:
extract some vertex v from Q
append v to P
for w in children of v
if all parents of w are in P
insert w in Q
return P
```

Performance: O(|E|)

Correctness: Invariant on P holds

Note that this is the

*same algorithm*as depth-first traversal.Can also go Sink-To-Source