Graph Algorithms: BFS and DFS

This lecture is our first in a series on Graph Algorithms. We begin by studying methods of searching a
graph. In order to follow these lectures, it is important that you have a basic knowledge of graph definitions,
including adjacency lists and matrix representations. References are included on the class webpage.
In this series of lectures, we use the usual notation for a graph, G, which consists of a set of vertices, V ,
and a set of edges, E. Each edge is represented as (u, v) ∈ E where u, v are vertices in V . Furthermore,
following the notation of CLRS, we assume that each vertex v stores its neighbors in an adjacency list:
Adj[v].
1 Breadth-first search
We begin with one of the simplest graph searching algorithms. The breadth-first search algorithm searches
through the graph G starting at node s, typically referred to as the source. From s, the breadth-first search
eventually discovers all other vertices in the graph that are reachable from s. The BFS search algorithm
has the following properties:
• BFS discovers all vertices reachable from s.
• It computes the shortest distance from s to every vertex in G.
• It produces a “breadth-first search tree” with root s that contains all reachable vertices from s.
• The algorithm works by exploring all vertices at distance k from s before discovering any vertices at
distance k + 1.
Before presenting the technical details of the algorithm, we begin with a demonstration of the search
process of BFS. The graph below begins with all vertices as “undiscovered” (marked in red). BFS starts
the search procedure at a given source node, s. This source node is typically part of the input. This is the
first node that is visited (marked in blue in the figure below) by BFS.
Next, BFS visits all vertices that are at a distance of one from the source. Recall that we measure
distance as the number of edges from s to the vertex.
Next, BFS visits all nodes at minimum distance 2 from s, as seen below.
This search process continues until all vertices are visited.
Whenever a new node is discovered along a particular edge in the graph, that edge is added to the BFS
tree (shown as orange edges). This tree is like a “trace” of the BFS search process. We shall see in the
next section that this tree is not necessarily unique. Depending on which vertices were visited along which
path, different BFS trees may result.
The shortest distance from the source to vertex v is given by the path of length 4 shown in the tree on
the right. Notice that there are certainly other paths in G that lead from s to v, however the path in the
BFS tree is guaranteed to have the shortest length.
This important property is stated below:
The shortest path from s to any vertex v in G is represented by a path in the BFS tree.
1.1 Additional tools required by the BFS algorithm
In this section we look at the actual details of the BFS algorithm. Let’s begin by looking at the data
structures and graph attributes that are needed. The BFS algorithm processes the vertices of G by
marking them as “visited” once they are discovered. Therefore we will need some way to keep track of this
property. Secondly, we need a data structure that maintains a list of the vertices that should be searched
next – and that structure should ensure that the vertices are visited in order of distance from
s. We also need to keep track of which edges are part of the final BFS tree, and what the final shortest
distance is from the source to each vertex. The tools necessary for each of these requirements are explained
below:
1. Recording when a vertex has been visited:
Each vertex v contains an attribute v.visited which is a boolean variable that indicates if this vertex
has been visited. There are variations of this concept in different versions of BFS. You may come across
the use of colors, or other boolean variables. In most cases, the idea is the same: we need a key associated
with vertex v that stores whether or not that vertex has been visited. This attribute of the vertex v is
set to true when that vertex is discovered by the BFS algorithm.
2. How to keep track of which vertices to visit next:
The BFS algorithm uses a QUEUE, Q, to maintain a list of the vertices to be visited. When a vertex
v is visited by BFS, the unvisited neighbors of v are added to the queue. The vertices are processed one
at a time by removing them (Dequeue) from the queue. An example of this process is shown below. In
this example, vertex A is removed from the queue and the neighbors of A, vertices B and C, are marked
as visited and added to the queue. At the next stage, vertex B is dequeued, and its unvisited neighbor
(vertex D) is marked as visited and added to the queue. The properties of the queue are such that we
enqueue on one end and dequeue on the other. Therefore the vertices are added to the queue in order of
their distance from s.
3. Recording the distance from s to v
The BFS procedure not only visits all the nodes in G that are reachable from s, but it also records the
minimum distance from s to any vertex v. This is done by using an additional attribute called v.distance
which represents the minimum path length from s to vertex v.
4. Building the breadth-first search tree during the search:
In order to store the structure of the BFS tree, we need a method to record which vertices are discovered
along which path. This is done using the attributes v.children and v.parent. When a vertex v is removed
from the queue, its unvisited neighbors are added to the queue (as described above) and then we add these
neighbors as children of vertex v. In the example below left, when vertex A is removed from the queue,
the unvisited neighbors B and C are added to A.children. The parent pointers are set as: B.parent = A
and C.parent = A. The next vertex to be removed from the queue is vertex B (below right). The unvisited
neighbor D is added to B.children, and we set D.parent = B.
In summary, each vertex uses the following additional attributes, which will be necessary for BFS:
v.distance, v.parent, v.children, v.visited
We are now ready to describe the BFS algorithm using the above mentioned attributes, along with the
Queue Q.
1.2 The BFS algorithm
Input: The input to BFS is a graph G, consisting of a list of vertices V and a list of edges E. Recall
that the neighbors of a vertex v are stored in Adj [v]. In the example below, Adj [s] = {a, c}. The algorithm
also takes as input a source vertex s.
The Queue operations: The algorithm uses a queue Q, and the operation Enqueue(Q, v) in order to
add a vertex to the queue, and the operation Dequeue(Q) to remove a vertex from the queue.
BFS(G,s)
Step 1: For all vertices in G, initialize them as unvisited: for all v ∈ V set v.visited = false.
Initialize the parent pointers to NIL for each vertex. Initialize the list v.children to NIL for
each vertex.
Step 2: Node s is the starting node, set s.visited = true and s.distance = 0. Add s to Q.
Step 3: Now we are ready to carry out the search. Vertices are removed from Q one at a time. When
a vertex is visited, its unvisited neighbors are added to Q.
while Q ≠ empty
    u = Dequeue(Q)
    for each v in Adj[u]
        if v.visited = false
            v.visited = true
            v.distance = u.distance + 1
            v.parent = u
            Add v to u.children
            Enqueue(Q, v)
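For concreteness, here is a minimal Python sketch of the pseudocode above. It assumes the graph is given as a dictionary adj mapping each vertex to a list of its neighbors (a hypothetical stand-in for Adj); the vertex attributes are stored in dictionaries.

from collections import deque

def bfs(adj, s):
    # Step 1: initialize every vertex as unvisited.
    visited = {v: False for v in adj}
    distance = {v: None for v in adj}
    parent = {v: None for v in adj}
    # Step 2: visit the source and enqueue it.
    visited[s] = True
    distance[s] = 0
    Q = deque([s])
    # Step 3: dequeue one vertex at a time and enqueue its unvisited neighbors.
    while Q:
        u = Q.popleft()
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                distance[v] = distance[u] + 1
                parent[v] = u
                Q.append(v)
    return distance, parent

For example, on a toy adjacency dictionary one would call bfs({'s': ['a', 'c'], 'a': ['s'], 'c': ['s']}, 's'), and distance would then hold the shortest path length from s to each vertex.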
We trace through the execution of the above algorithm using the example shown in step 2.
Vertex S is removed from Q. Neighbors C and A are added to Q and marked as visited. Their distances
are set to s.distance + 1, and their parent pointers are set to s. Furthermore, both A and C are added to
s.children.
Vertex C is removed from Q. Neighbors B and G are added to Q and marked as visited. Their distances
are set to C.distance + 1 = 1 + 1 = 2, and their parent pointers are set to C. Furthermore, both B and G
are added to C.children.
Vertex A is removed from Q. There are no unvisited neighbors.
Vertex B is removed from Q. Neighbor D is marked as visited and added to Q. We set D.distance =
B.distance + 1 = 2 + 1 = 3, and D.parent = B, and B.children = {D}.
Vertex G is removed from Q. Neighbors H, L, F are marked as visited and added to Q. We set each
distance to G.distance + 1 = 2 + 1 = 3, and we set each parent to G. Finally G.children = {H, L, F }.
Vertex D is dequeued. Neighbor E is marked as visited and enqueued. We set E.distance = D.distance+
1 = 3 + 1 = 4 and E.parent = D and D.children = {E}.
Vertex H is dequeued. Neighbor J is marked as visited and enqueued. We set J.distance = H.distance+
1 = 3 + 1 = 4, and H.children = {J} and J.parent = H.
Vertex L is dequeued. Neighbor K is enqueued. We set K.distance = L.distance + 1 = 3 + 1 = 4, and
L.children = {K} and K.parent = L.
Finally, the remaining vertices F, E, J and K are dequeued one by one. None of them have any unvisited
neighbors. The main while loop of BFS terminates. Each vertex has the correct shortest distance from s
stored in v.distance, and the BFS tree is complete.
Runtime:
The initialization of BFS in Step 1 runs in time O(V). During the execution of BFS, each vertex is
placed in the queue at most once. Therefore the total time spent adding to and removing from Q over the
entire execution of BFS is O(V). Secondly, the for loop above is executed once for each vertex u in Adj[v],
for every vertex v. Since each edge is stored twice in the adjacency list, the body of the for loop is executed
twice for each edge in the graph. The total run time of the for loop is therefore O(E), and the overall
runtime of BFS is O(V + E).
In the next section we look at a different search algorithm, one that doesn’t branch out in the same
way BFS does, but instead opts to search “deeper” into the graph G before branching.
2 Depth-first search
2.1 Overview
The depth-first approach is to search “as far as possible” into the graph before branching out. Imagine
searching along a path exploring unvisited vertices as far as possible. When there are no more unvisited
vertices to discover, the search backtracks and looks for paths from the last visited vertex. As with BFS,
the DFS algorithm can run on both undirected and directed graphs.
DFS takes as input both a graph G and a source vertex s, and similarly uses the attribute v.visited.
Initially, only the source vertex is marked as visited (in blue below):
Next, DFS searches along a neighbor of the current vertex, continuing in this way as far as possible:
At this point the search “backtracks” and looks for a previous neighbor which leads to an undiscovered
node. The search then continues along edges to new unvisited neighbors:
Eventually all vertices are visited and DFS terminates:
As with BFS, the path traced out by this DFS search is referred to as a DFS tree (shown in orange).
Notice that the DFS tree in this example is very different from the BFS tree.
2.2 The DFS Algorithm
The DFS algorithm requires an attribute v.visited in order to keep track of which vertices have been
visited. The proper execution of the algorithm does not actually need anything else! However, as we shall
see in the next lecture, some other attributes may be useful once DFS has completed. Therefore we typically
use the following attributes:
• The attribute v.visited is used to mark whether or not the vertex has been visited in the search
• The attribute v.parent is used to store the parent node in the DFS tree
• The attribute v.distance is used to store the distance from s to v in the DFS tree. Note that this is
not the minimum distance in the graph G.
The algorithm does not need an external data structure such as a queue or stack, because it can be written
recursively. It is the recursive algorithm that ensures that the search continues along edges to new vertices
as long as possible. The easiest way to trace through the algorithm is to keep track of the recursive calls.
We demonstrate the process on the small undirected graph below.
DFS(s)
Step 1: For all vertices in G, initialize them as unvisited: for all v ∈ V set v.visited = false.
Initialize the parent pointers to NIL for each vertex. Set s.distance = 0.
Step 2: Call the recursive algorithm DFS-visit(s). This is a call to the recursive algorithm below:
DFS-visit(u)
Mark node u as visited: u.visited = true
For each v ∈ Adj [u]
If v.visited = false
v.parent = u
v.distance = u.distance + 1
DFS-visit(v)
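A direct Python transcription of DFS(s) and DFS-visit might look as follows; this is a sketch under the same assumptions as the BFS code above (adj is a dictionary of neighbor lists, and the attributes live in dictionaries). Note that for deep graphs, Python's default recursion limit may need to be raised.

def dfs(adj, s):
    visited = {v: False for v in adj}
    parent = {v: None for v in adj}
    distance = {v: None for v in adj}
    distance[s] = 0

    def dfs_visit(u):
        # Mark u as visited, then recurse into each unvisited neighbor.
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                parent[v] = u
                distance[v] = distance[u] + 1
                dfs_visit(v)

    dfs_visit(s)
    return parent, distance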
We now illustrate the execution of the above recursive algorithm. The first call is to DFS-visit(s), using
the source vertex, s:
DFS-visit(s): The call to DFS-visit(s) marks vertex s as visited, and the neighbors of s are processed
by the for loop. Since neighbor B is not visited, its parent is set to s, its distance is set to s.distance + 1 =
0 + 1 = 1, and a recursive call is made to DFS-visit(B). Note that the next neighbor of s, namely C, will
not be examined until DFS-visit(B) terminates.
DFS-visit(B): Vertex B is marked as visited, and suppose the first unvisited neighbor in the for loop
is D. The parent of D is set as B, and D.distance = 2. Next, a recursive call is made to DFS-visit(D).
DFS-visit(D): Vertex D is marked as visited, and suppose the first unvisited neighbor in the for loop
is F . The parent of F is set as D, and F.distance = 3. Next, a recursive call is made to DFS-visit(F).
DFS-visit(F): Vertex F is marked as visited, and the only unvisited neighbor is E. The parent of E is
set as F , and E.distance = 4. Next, a recursive call is made to DFS-visit(E).
DFS-visit(E): Vertex E is marked as visited. There are no unvisited neighbors, so no recursive calls
are made. The call to DFS-visit(E) terminates. The current stage of the recursive process is shown on the
right below:
At this point in the recursion tree, the next vertex to be processed is any other unvisited neighbor of
F. Again, there are none, so DFS-visit(F) terminates. The next vertex to be processed is any unvisited
neighbor of D. Neighbor C of vertex D is unvisited. The parent of C is set as D and C.distance = 3.
Next, a recursive call is made to DFS-visit(C).
DFS-visit(C): Vertex C is marked as visited. Since there are no unvisited neighbors, the call to
DFS-visit(C) terminates. At this point in the recursion tree, the next vertex to be processed is any other
unvisited neighbor of D. There are none. So the next vertex is any unvisited neighbor of B. Again,
there are none. Finally, since vertex S also has no unvisited neighbors, the execution of all levels of the
recursion tree terminates and DFS-visit(s) is complete.
Runtime:
The DFS algorithm marks each vertex as visited only once. For each vertex v, the algorithm carries out
a for loop over each neighbor of v. Over all vertices in the graph, this is equivalent to doing a constant
amount of work for each edge in the graph. The total runtime is then Θ(V + E).
2.3 Running DFS on the entire graph G
The DFS(s) algorithm only discovers nodes in the graph that are connected to the original node s (by
an undirected path for undirected graphs, or a directed path for directed graphs). The DFS(s) algorithm
may terminate before the entire graph G has been discovered. In order to visit each node of the graph, we
simply restart DFS as long as there are more unvisited nodes in G. The algorithm below does not take a
source vertex as input, and instead calls DFS-visit(v) for any unvisited node v. This is repeated until all
vertices of the graph are visited.
DFS(G)
Step 1: Initialize all vertices in G to unvisited. Initialize all parent pointers to NIL, and all distances
to 0.
Step 2: Now carry out DFS-visit as long as there are still unvisited vertices:
for all v ∈ V
if v.visited = false
DFS-visit(v)
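In Python, only the driver changes; a minimal sketch under the same assumptions as before:

def dfs_all(adj):
    visited = {v: False for v in adj}
    parent = {v: None for v in adj}

    def dfs_visit(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                parent[v] = u
                dfs_visit(v)

    # Restart the search from every vertex that is still unvisited.
    for v in adj:
        if not visited[v]:
            dfs_visit(v)
    return parent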
2.4 DFS on directed graphs
The DFS algorithm on a directed graph works in the same way. The resulting DFS tree is stored in the
parent pointers which are set during the DFS algorithm. The result of calling DFS(G) may be a single
tree, or a set of trees called a DFS forest. The example below shows the result of DFS(G) on the directed
graph. The DFS forest is drawn on the right. The edges belonging to the DFS trees are shown in orange.
The remaining edges of the graph G that are not part of the DFS trees can be classified as either back
edges, forward edges, or cross edges.
In summary, the completion of DFS(G) on a directed graph results in the following 4 types of edges:
• Tree Edges: edges that are in G and also part of the DFS tree
• Back edges: non-tree edges (u, v) ∈ E that connect u to an ancestor v in the tree
• Forward edges: non-tree edges (u, v) that connect a vertex u to a descendant v in the DFS tree.
• Cross edges: all other edges
Graph Algorithms 2: Topological sort and Strongly connected components
In this lecture we study algorithms on directed graphs. Our first algorithm is Topological sort, which is a
sorting algorithm on the vertices of a directed graph. Suppose the nodes of a graph represent courses that
are required for a specific program, and a directed edge (a, b) indicates that course a must be taken before
course b. If the requirement is to take all the courses in order to complete the program, then it would be
necessary to find an ordering of the courses that respects the prerequisite requirements. Such an ordering
is in fact a Topological sort: a linear ordering of the courses so that for all edges (a, b) in the graph, course
a precedes course b in the ordering:
1 Topological sort
Formally, a topological sort is a linear ordering of V on the graph G = (V, E) such that for all (u, v) ∈ E,
the vertex u appears before v in the ordering. If the graph contains a cycle, then no topological ordering
is possible. Therefore we consider only directed acyclic graphs in this section, commonly referred to as
dags. Note that a directed graph may have more than one possible topological ordering, as in the example
below:
The DFS algorithm from our last lecture is in fact at the heart of finding this topological order. Let’s
start by looking at an example of a dag and the resulting DFS tree. In the figure below, we show the
original graph G, and the result of DFS-visit(A). The DFS tree is drawn on the right, showing not only
the tree edges, but also the other edge types of G (back, forward, and cross). By examining the DFS tree
and the additional edges, it is relatively easy to identify a topological ordering: A, C, D, B, F, E. How
does this ordering relate to how the vertices were discovered by DFS? It is not exactly the order in which
they were discovered. Instead, it is in fact the reverse of the order in which they were finished. Notice that vertex E
is the first vertex from which a “backtrack” was made. In other words, the order of the recursive calls
was DFS-visit(A), DFS-visit(C), DFS-visit(D), DFS-visit(E), and then DFS-visit(E) terminated because
there were no more unvisited neighbors. Next, DFS-visit(D) made a call to DFS-visit(F), which then also
terminated because it had no more unvisited neighbors. Therefore the second vertex to finish is vertex
F, which places it second-last in the topological ordering.
This is in fact the key to selecting a topological ordering from the DFS tree: write the nodes in
decreasing order of the “time” at which they were finished in the DFS algorithm. It is important to
note that the graph may have other topological sorts, but this method produces one of them.
1.1 The use of time-stamps in DFS
DFS can be implemented to keep track of a pair of time-stamps for each vertex. These time-stamps are
useful in the Topological sort algorithm and in many other algorithms.
Time stamps in DFS
The “unit” of time is incremented each time a vertex is discovered, and each time a call to
DFS-visit(v) terminates.
In the example below, we write next to each vertex a pair of time stamps that represent when
DFS-visit(A) first discovers that vertex, and when it finishes processing the neighbors of that vertex.
One confusion regarding DFS time-stamps is that they are often incorrectly associated with “walking” along
the edges (probably due to the many online animations that associate this walking with the DFS algorithm).
In the example above, vertex A is discovered at time 1, then vertex C, then vertex D, then vertex E. At
this point it is “time” 4. Vertex E now terminates its DFS call, and this termination counts as “a step”;
therefore its termination occurs at step 5. The finish time stamp of vertex E is 5. The next vertex to be
discovered is vertex F , which occurs at time step 6. Be careful! The time step is 6 – NOT 7. Often the
misconception is that we had to somehow walk over to vertex F , which takes 2 steps. This is incorrect.
Think of the time-stamps as marking the next event that happens, where events are either discovering a
vertex or ending a DFS call. Therefore vertex F is discovered next, at step 6, and then it terminates at
time 7. The next vertex to be discovered is vertex B, at time 8; it then terminates at time 9, and then
finally D terminates at time 10 since it has no more undiscovered neighbors. Finally C terminates at
time 11, then A terminates at time 12.
Finally, using the finish times of DFS, we can write the vertices in decreasing order of their finish time
and the result is a valid topological order, as shown in the example above.
This fact is stated below:
Theorem 1. For a directed acyclic graph G, by ordering the vertices in decreasing order of their
finish time in DFS, the result is a topological ordering.
Proof: We just need to argue that for any edge (u, v) ∈ E, the finish time of v is less than the finish time
of u. At some point during the execution of DFS, the algorithm must visit node u and consider the edge
(u, v). There are three distinct cases.
Case 1: If node v is not yet visited, then node v is visited after u, and DFS will completely finish
visiting v before it finishes u. This represents the case for edge (c, b) in the above example. When DFS
first visits vertex c, the neighbor b is not yet visited. This means that the finish time of c will be after that
of b.
Case 2: If node v is already visited, but not yet finished, then this means there is a path from v to
u through some other part of the graph, and the edge (u, v) creates a cycle. This is impossible in a dag.
Case 3: If node v is already visited and is finished, then its finish time has already been stamped.
Therefore, since we are still exploring u, its finish time will be greater than that of v. This is the case
with edge (f, e) in the above example. When DFS explores vertex f, its neighbour e is already finished.
Therefore the time stamp of f's finish time will be greater than that of e.
1.2 Updating the DFS algorithm to record time stamps
It remains to show how to update the DFS algorithm so that we can record these time-stamps. We alter
the DFS-visit algorithm by adding a global variable called time, which is initialized as 0. When all the
neighbors of a vertex have been explored, the vertex is “finished” and we time-stamp the attribute v.finish
with the current value of time.
DFS-visit(u)
time = time + 1
Mark node u as visited: u.visited = true
u.start = time
For each v ∈ Adj [u]
If v.visited = false
v.parent = u
v.distance = u.distance + 1
DFS-visit(v)
time = time + 1
u.finish = time
The topological sort algorithm follows directly from the information stored in each node after the
execution of this updated DFS:
Topological-sort(G)
Step 1: Call DFS(G) to compute the finish times.
Step 2: As each vertex is finished in DFS, insert that vertex at the front of the topological ordering.
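A possible Python sketch of the time-stamped DFS together with topological sort is given below, again assuming a dictionary-of-lists graph representation. Inserting each finished vertex at the front of the ordering is realized here by appending to a list and reversing it at the end, which is equivalent.

def topological_sort(adj):
    visited = {v: False for v in adj}
    start, finish = {}, {}
    order = []                     # vertices in order of finish time
    time = 0

    def dfs_visit(u):
        nonlocal time
        time += 1
        visited[u] = True
        start[u] = time
        for v in adj[u]:
            if not visited[v]:
                dfs_visit(v)
        time += 1
        finish[u] = time
        order.append(u)            # u is finished: record it

    for v in adj:                  # restart DFS until all vertices are visited
        if not visited[v]:
            dfs_visit(v)
    return list(reversed(order))   # decreasing order of finish time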
Runtime:
The runtime of DFS is Θ(V + E), and the extra time needed to simply place the vertices in a linked
list in the order that they are finished is O(1) per node. Therefore the overall runtime is Θ(V + E).
2 Strongly Connected Components
One important notion that characterizes graphs is the notion of connectivity. For directed graphs, we can
describe connectivity in terms of the existence of directed paths between pairs of vertices.
A Strongly connected component, or SCC, of a graph is a maximal set of vertices such that for
every pair of vertices u and v in the set, directed paths exist from u to v and from v to u.
The directed graph below consists of four strongly connected components. Note that each vertex within
a strongly connected component can reach every other vertex of that component along a directed path.
Vertex G is in its own strongly connected component, because there is no other vertex with a directed
path both to and from G. Similarly, vertex H is in its own SCC.
The algorithm that identifies these strongly connected components is again based on DFS. There are
three main steps:
Step 1: The first step is to call DFS(G) and compute the start/finish time stamps. The first vertex
that is picked to start DFS can be any of the vertices of the graph. Recall that if DFS-visit() terminates before
visiting all vertices, it restarts from a new vertex. In the example below, DFS-visit starts on vertex E,
visits vertices G, F and then terminates at E. Next, it restarts at A, visits D, C, B and terminates at
A. Finally DFS-visit restarts at H and terminates. The corresponding DFS trees are shown on the right,
making up the DFS forest.
Step 2: Create a new graph which is the result of reversing all the edges in G. This graph is called G^T.
Step 3: Call DFS(G^T). In the main loop of DFS the vertices are considered in decreasing order of
finish time. In the example below, we start with DFS-visit(H) since it has the highest finish time. The
call terminates immediately since H has no neighbors. The next call is to DFS-visit(A) since it has the
next largest finish time. The resulting search visits A, B, C, D and then terminates. Next the call is made
to DFS-visit(E), with the third largest finish time. Vertices E and F are discovered. Finally, the last call is
to DFS-visit(G), which terminates immediately since there are no unvisited neighbors. The corresponding
DFS trees are shown on the right. Each DFS tree corresponds to a strongly connected component!
The result is exactly four strongly connected components.
Runtime: The algorithm above runs a DFS twice: once on G and again on G^T. Therefore the total
runtime is Θ(V + E).
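The three steps translate into a short Python sketch (this is the flavor of algorithm described above, often called Kosaraju's algorithm; adj is again an assumed dictionary-of-lists representation):

def strongly_connected_components(adj):
    # Step 1: DFS on G, recording vertices in order of finish time.
    visited = {v: False for v in adj}
    order = []

    def dfs1(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                dfs1(v)
        order.append(u)

    for v in adj:
        if not visited[v]:
            dfs1(v)

    # Step 2: build the transpose graph G^T by reversing every edge.
    radj = {v: [] for v in adj}
    for u in adj:
        for v in adj[u]:
            radj[v].append(u)

    # Step 3: DFS on G^T in decreasing order of finish time.
    visited = {v: False for v in adj}
    components = []

    def dfs2(u, comp):
        visited[u] = True
        comp.append(u)
        for v in radj[u]:
            if not visited[v]:
                dfs2(v, comp)

    for u in reversed(order):
        if not visited[u]:
            comp = []
            dfs2(u, comp)
            components.append(comp)   # one DFS tree = one SCC
    return components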
Graph Algorithms 3: the MST
1 Weighted graphs and the Minimum Spanning Tree
In this lecture we study algorithms on weighted graphs, that is, graphs where each undirected edge has
an associated weight. The figure below is an example of a weighted graph G. We use the notation w(u, v)
for the weight of the edge from vertex u to vertex v. In this lecture we assume all weights are positive.
Weighted graphs are often used to model situations where a connection (an edge) has an associated
weight that could equal its cost, its length, etc. For example, a set of cities could be represented as vertices
of a graph, and the distance between a pair of cities is the weight of the corresponding edge. Modelling this situation with
a weighted graph is shown below:
1.1 Minimum Spanning Tree
Following the above analogy, suppose we wish to connect the cities with a set of telephone wires in such a
way that every city could make a call to any other city along a path of telephone wires. If the cost of each
wire is proportional to its length, we would like to do this in such a way that minimizes the total cost of
the wires. This problem can be solved by computing the Minimum Spanning Tree (MST). In the figure
below we show two ways of connecting the cities. The choice on the left has a total cost of 1119 and the
choice on the right has a cost of 1025. The choice on the right represents the Minimum spanning tree: it
is the minimum of all possible ways of connecting the cities.
Minimum spanning tree
Let G be a weighted undirected graph. The minimum spanning tree is a set of edges that connects
all the vertices of G and whose total weight is minimized.
We begin this lecture by presenting specific properties of the MST, and then provide two algorithms
for determining the MST: Prim’s and Kruskal’s.
1.2 Properties of the MST
The definition of the Minimum spanning tree results in several important properties.
1. MST is not necessarily unique
The minimum spanning tree on a graph G is not necessarily unique. There may be several ways to
connect the vertices that result in the same overall weight. The example below has two different minimum
spanning trees each of total weight 3.
2. MST is a tree
The MST does not contain any cycles – therefore it is a tree. This may seem obvious from its name,
but it is important to point out why this is true. Recall that we assume all weights are positive. Imagine
for a moment that the minimum spanning tree contained a cycle, as in the figure below. Then certainly
removing one of the edges on that cycle would decrease the total weight, and the vertices would still be
connected. Therefore the MST does not contain a cycle: removing an edge of the cycle would decrease the
total weight while still connecting the vertices. In the example below, the selected edges contain a cycle,
and therefore cannot possibly be the MST. Removing any edge on the cycle would still connect the
vertices and would have smaller weight.
3. Any cut edge is in the MST
A cut edge is an edge of the graph G whose removal disconnects the graph. Any cut edge of G must
be part of any MST. Why is this true? Again, imagine for a moment that this were not the case: that some
cut edge e were not part of the MST. Removing e disconnects G, so every path between the two resulting
pieces must use e. Since the MST does not include the edge e, it cannot possibly connect all the vertices of G.
4. Adding any other edge from E to the MST creates a cycle
Since the MST is actually a tree, if we add any additional edge, we create a cycle. This is actually
a property of trees. As can be seen in the figure below, the MST edges already connect all the vertices. For
example, there is a path from vertex A to vertex F in the MST. If we add the edge from A to F, then we
create a cycle (the path from A to F plus the edge F to A).
5. The lightest edge across any partition must be in the MST
Given any graph G, suppose we partition its vertices (in any way) into two groups. There may be some edges
of the graph that “cross” this partition. The lightest edge out of all the edges that cross from one “side”
to the other must be part of the MST.
In the next section, we look at the first algorithm for constructing the MST.
2 Prim's algorithm
The first algorithm that we study in this lecture is called Prim’s algorithm. This algorithm “grows” the
minimum spanning tree from an initial vertex, adding one edge at a time, until the entire tree is built.
The algorithm is based on property 5 above. In other words, at every step, Prim's algorithm is able to
choose with certainty an edge that must be included in the MST, and therefore it is selected and added to
the current tree.
We begin with an example of how this algorithm works before diving into the details. The basic idea is
that we keep track of how “far away” each vertex is from the current tree. The edge that is added next is
always the edge that connects the next vertex that is “closest” to our tree. In the example below, Prim's
algorithm begins at vertex A. The closest vertex that can be reached from A is vertex D. The tree now
consists of two vertices as shown on the right.
The next vertex that is closest to the current tree (the current tree contains vertices A and D) is vertex
B, at distance 4. Next, vertex H is now only distance 1 from the tree, and it is added next.
At this stage, both vertices C and G are at distance 3 from the tree. One of them is chosen next,
suppose vertex C. From C, vertex G is added next, and finally vertex L, which is only at distance 1 from
the tree at the last step.
Prim’s algorithm selects vertices based on their current distance to the tree. That distance changes as
the tree grows. In the above example, note that vertex C is originally at distance 6 from the tree of vertices
A and D. But later, as the tree grows and B is added, vertex C is only distance 3 from the tree. Therefore
we need a data structure that can maintain these distances and update them efficiently.
2.1 The Priority Queue and distances
Each vertex requires a distance attribute which stores its current distance from the MST. This attribute
is called v.d for vertex v. Vertices are chosen one by one and added to the MST based on their distance
attribute. We require a structure that enables us to quickly determine which vertex is the “closest” to the
MST.
Prim's algorithm uses a Priority Queue. This is not the same as the Queue we used to implement
BFS. The Priority Queue that we use here is a Min Heap, with the additional operation decrease-key.
The heap itself contains vertex objects, but the heap structure is built on the property v.d. Recall that we
can delete the min from a Heap in time O(log n), and we also saw how to decrease the value of a node in
the heap in time O(log n). We also saw previously in the course that we can build a heap on n items in
O(n) time. These operations together with the Heap structure are called a Priority Queue, Q. We will
use the following pseudocode operations:
• Extract-min(Q): Extracts the vertex object with minimum v.d
• Decrease-key(Q, v, w): The attribute v.d is set to w and the heap structure is updated.
We now move on to the details of Prim's algorithm.
2.2 The algorithm
Prim’s algorithm begins building the MST from any vertex v of the graph. The starting vertex may affect
the final tree that results. However, the resulting tree is guaranteed to have minimum total weight.
We start with the initialization step:
Prim(G,s):
Step 1: Initialize the variables
– Set v.d = ∞ for each vertex v and set the parent pointers to NIL for all vertices.
– Initialize an empty tree T .
– Set s.d = 0.
– Insert all vertices into the priority queue Q.
The graph below shows the result of the first step of Prim's algorithm where s = A. We do not “draw”
the priority queue, but instead write next to each vertex the value of v.d. Note that all vertices except A
have v.d = ∞.
Step 2: Remove the minimum-distance item from the queue one at a time until the queue is empty.
The vertex u with minimum distance is added to the MST. For any neighbor v of u, we check if its distance
to u is less than the current value of v.d. This is equivalent to saying that vertex v is now “closer” to the
MST.
while Q ≠ empty
    u = Extract-min(Q)
    for each v in Adj[u]
        if v ∈ Q and v.d > weight(u, v)
            *Update the distance to v from the tree
            Decrease-key(Q, v, w(u, v))
            *Set this node as the child of u
            v.parent = u
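Before tracing an example, here is a possible Python sketch of Prim's algorithm. Python's heapq module does not support Decrease-key directly, so this sketch uses the common “lazy deletion” workaround: a fresh entry is pushed for each distance update, and stale entries are skipped when popped. It assumes adj maps each vertex to a list of (neighbor, weight) pairs.

import heapq

def prim(adj, s):
    d = {v: float('inf') for v in adj}
    parent = {v: None for v in adj}
    d[s] = 0
    in_mst = set()
    pq = [(0, s)]                        # heap of (v.d, v) pairs
    while pq:
        du, u = heapq.heappop(pq)
        if u in in_mst:
            continue                     # stale entry: skip it
        in_mst.add(u)
        for v, w in adj[u]:
            if v not in in_mst and w < d[v]:
                d[v] = w                 # Decrease-key, simulated by a push
                parent[v] = u
                heapq.heappush(pq, (w, v))
    return parent                        # MST edges are (v, parent[v])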
Let’s look at the first iteration of this while loop. The node with the minimum distance is A because
A.d = 0. Vertex A is deleted from Q (which can be interpreted as it now being included in the MST). The
neighbors of A are updated to D.d = 2 and B.d = 4. At this stage, all white vertices are in the priority
queue, and the purple vertices are in the tree.
The next iteration of the while loop removes vertex D (since it has the minimum value of D.d = 2).
The neighbor C is updated:
The next iteration removes vertex B (again, its value B.d = 4 is the minimum of all distances in the
queue). The neighbors G, L, C and H all have their distances updated:
The next few iterations of the algorithm are shown below:
The last figure on the right represents the case where the queue is now empty. The while loop terminates.
One issue we didn’t address above was how the edges were added to the MST. Note that the above
pseudo-code sets parent pointers whenever the distance is updated in the priority queue. These parent
pointers may be updated several times for a particular vertex v. Let’s take vertex C for example. When
vertex D was added to the MST, the value of C.d was updated to 6 and its parent pointer was actually
set to D. Later, when vertex B was added to the MST, the value of C.d was updated again to 3, and
its parent pointer was reset to B. When C was finally deleted from the queue and added to the MST,
its parent pointer was B. From the moment a vertex is deleted from Q, its parent pointer remains fixed,
and the edge to its parent is the corresponding edge of the MST.
Runtime:
Step 1 above takes O(V ) time, since a constant amount of work is done per vertex, and a heap of size
V is built (taking time O(V )).
Step 2 carries out several operations. Let's focus first on the Extract-min operations. Over the entire
course of the algorithm, each vertex is removed from the queue exactly once. Recall that the priority queue
carries out an Extract-min in O(log n) time, and in this case n = |V|. Therefore each Extract-min takes
time O(log V). Since there are V vertices, this accounts for a runtime of O(V log V). Next, let's consider
the Decrease-key operations. The for loop that examines the adjacency list of each vertex examines each
edge exactly twice during the entire algorithm, and each examination may carry out a Decrease-key
operation (taking time O(log V)). This accounts for a total runtime of O(E log V).
The overall runtime of Step 2 is therefore O(E log V + V log V) = O(E log V), and thus the
runtime of Prim's algorithm is O(E log V).
3 Kruskal's algorithm
Kruskal's algorithm operates in a similar way, except that it greedily selects the edges to add to the MST
in order of increasing edge weight. The idea is quite simple: edges are added to the tree from smallest to
largest. An edge is added unless it connects vertices that are already connected. Therefore the tree
is not grown from a source vertex s; instead, each vertex starts out as its own component, and components
are slowly merged together as new edges are added.
Kruskal’s algorithm doesn’t use a priority queue, but instead requires a structure that allows the algorithm to keep track of the different components of the tree that have currently been established. For
example, the figure below is a snapshot of an intermediary phase of Kruskal, where the MST is growing
by “components”. One component consists of vertices G, F, L and the other consists of vertices B, H. As
Kruskal’s algorithm continues, these components are slowly merged together until there is one resulting
tree.
3.1 Union and Find
In this section we focus on how we can efficiently keep track of which vertices are part of which components
during the execution of Kruskal. We require two main operations:
• A Merge(u,v) operation, which efficiently takes the component containing vertex u and merges it
with the component containing vertex v.
• A Find(v) operation, which returns the component containing vertex v. This is important, since
we certainly don’t need to concern ourselves with merging vertices that are already in the same
component.
The solution is quite simple. We store each component as a linked-list, where the front of the list is
used to identify the component. Each vertex in the list has a pointer (for example, called mycaptain) that
points to the front of the list.
Find(u):
The Find(u) operation works by simply looking up the mycaptain attribute. Two vertices are in the
same component if they have the same mycaptain variable. This takes O(1) time.
Merge(u,v):
In order to merge two components together, we add the smaller list to the back of the bigger list. Then
we reassign the mycaptain pointers for all vertices in the smaller list. The runtime of this depends on the
number of vertices in the smaller list. Notice that whenever a vertex's mycaptain pointer is reassigned, the
size of the list containing that vertex at least doubles. For a set of n elements, a list can double at most
O(log n) times, so each vertex has its pointer reassigned at most O(log n) times. The figure below shows the result
of merging a component of length 4 with one of length 2. Only two pointers are reassigned.
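The scheme described above might be sketched in Python as follows; the attribute name mycaptain follows the text, and the members lists are kept so that the smaller component can be found and relabelled.

class Components:
    def __init__(self, vertices):
        # Each vertex starts as the captain of its own one-element list.
        self.mycaptain = {v: v for v in vertices}
        self.members = {v: [v] for v in vertices}

    def find(self, u):
        return self.mycaptain[u]          # O(1) lookup

    def merge(self, u, v):
        a, b = self.find(u), self.find(v)
        if a == b:
            return                        # already in the same component
        # Append the smaller list to the back of the bigger one.
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a
        for x in self.members[b]:         # reassign captains in the smaller list
            self.mycaptain[x] = a
        self.members[a].extend(self.members[b])
        del self.members[b]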
3.2 The Algorithm
Kruskal’s algorithm simply sorts the edges of the graph and processes them one at a time. Processing the
edges from smallest to largest, the algorithm checks if the edge can be safely added to the tree. If the edge
is between two vertices that are already connected in the same component, then the edge is not added.
Otherwise, the edge is added.
Kruskal’s
Step 1: Initialize the variables and sort the edges
– Initialize an empty tree T .
– Sort the edges E of the graph from smallest to largest weight.
– Each vertex v is placed in its own component.
Step 2: Go through the edges in sorted order. Whenever an edge connects two different components,
merge those components together.
For each edge e = (u, v) in sorted order:
    If Find(u) ≠ Find(v)
        Add edge (u, v) to T.
        Merge(u, v)
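Using the Components class sketched in Section 3.1, Kruskal's algorithm is only a few lines of Python; edges are assumed to be given as a list of (weight, u, v) triples so that sorting orders them by weight.

def kruskal(vertices, edges):
    T = []                                # the MST edges collected so far
    comps = Components(vertices)          # one component per vertex
    for w, u, v in sorted(edges):         # process edges from smallest to largest
        if comps.find(u) != comps.find(v):
            T.append((u, v, w))           # safe edge: add it to the tree
            comps.merge(u, v)
    return T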
The first edge that is processed above has weight 1. Suppose this is the edge from G to L. These 2
vertices are merged into the same component. Note that the sorted list is now shorter:
The next shortest edge is from B to H. These two vertices are merged into the same component. We
use a different color to indicate that they are currently in different components.
The next shortest edge has length 2. Suppose that edge is from G to F. Vertex F is merged into the
component containing G and L:
The next shortest edge has length 2, from vertex A to D:
The next shortest edge has length 3, and suppose this edge is from C to B. Therefore vertex C gets merged
with the component containing vertex B and H.
The next shortest edge has length 3, and suppose this edge is from C to G. Therefore the green and
purple components get merged.
The next shortest edge has length 3. However, this edge is between two vertices that are already in the
same component. Therefore we look next at the edge of length 4. Suppose this edge is between vertex K
and L:
And finally the next edge of length 4 merges the last two components. Although there are still edges left
in the list after this merge is complete, they are processed without causing any further merges, and
Kruskal's algorithm terminates.
Runtime:
In step 1, we sort the edges. This takes time O(E log E), assuming we use Mergesort. In step 2, we
carry out both Find and Merge operations. Each Find operation takes constant time, and we perform
O(E) Finds, two for each edge processed. We showed above that any one vertex has its pointer reassigned
at most O(log V) times over all the Merge operations, so over V vertices, the total cost of all the merges is
O(V log V). The total runtime is then O(V log V + E log E) = O(E log E). Note however that E ≤ V², and
so O(E log E) = O(E log(V²)) = O(E log V). The runtime is the same as that of Prim's algorithm.
Graph Algorithms 4: Single Source Shortest Path
There are many everyday problems that involve finding the shortest path from one object to another.
For example, suppose we are given a map consisting of cities and roads, and we want to find the shortest
route to a particular destination. This is an example of solving the shortest-path problem.
1 Single Source Shortest path
Given a directed graph G with weighted edges, and a source vertex s, the single source shortest-path
or SSSP, is the set of all shortest paths from the source vertex, s, to any other vertex v in the graph. In the
example below, the SSSP is highlighted in orange. The edges represent the shortest paths from s to any
other vertex in the graph. For example, the shortest path from S to X follows along the path S → A → X
having total weight 6. Note that any other path to vertex X has a larger total weight.
The notion of a shortest-path is well-defined, as long as there is no negative-weight cycle that is
reachable from s. In the example below, it is clear that having a negative-weight cycle means that you
could walk around the negative-weight cycle indefinitely, making the weight of the path smaller and smaller.
Therefore we consider only directed graphs that have no negative-weight cycles.
The SSSP also satisfies the following fact, which is important in developing the first algorithm
used to solve this problem.
Fact 1:
Suppose G is a directed graph with no negative-weight cycle. Let p be the shortest path from vertex
s to t, and suppose vertex v is an intermediary vertex on this path. Then the shortest path from s to v
is given by the first part of p, and the shortest path from v to t is given by the second part of p.
In this lecture we look at two algorithms which solve the SSSP problem:
• Dijkstra’s algorithm: solves the SSSP problem for directed graphs with non-negative weights.
• Bellman-Ford algorithm: solves the SSSP problem for directed graphs and allows for negative-weight
edges.
2 Dijkstra's algorithm
The first algorithm we look at is Dijkstra's algorithm, a technique used to solve the SSSP problem on
weighted directed graphs with no negative-weight edges.
This algorithm is very similar to Prim’s algorithm: it grows a shortest-path tree from the source
vertex s and uses a Priority queue to maintain the current distances from s to any vertex v. In fact, the
pseudo-code below is very similar to that of Prim's. The main difference lies in the distance attribute v.d:
Dijkstra: The v.d attribute stores the total distance from s to v
Prim: The v.d attribute stores the distance from v to the closest node in the tree.
We shall see how this difference creates a different tree as we work through the first example below.
2.1 The Algorithm
Recall from our lecture on Prim’s algorithm that we defined two main operations on the priority queue, Q:
Extract-min(Q) and Decrease-key(Q,v,w)
These operations are defined in the same way as they were for Prim's algorithm. In particular,
Decrease-key(Q,v,w) assigns v.d = w and updates the heap accordingly. Each of these runs in time O(log n)
for a heap of size n. As in our demonstration of Prim's algorithm, we will not illustrate the items in the priority queue.
Instead, we show the values of the v.d attributes above each vertex, and the reader must keep in mind
that those values are stored in a heap, where the minimum element is extracted using Extract-min(Q).
Dijkstra’s SSSP
Step 1: Initialize the distances: set v.d = ∞ for all vertices and set s.d = 0
Add all vertices to the priority queue Q.
Initialize an empty tree T
Step 2: Build the SSSP tree:
While Q not empty:
u =Extract-min(Q)
for each v in Adj [u]
if v.d > u.d + w(u, v)
*We have found a shorter route to v
Decrease-key(Q,v,u.d + w(u, v))
v.parent = u
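A Python sketch of Dijkstra's algorithm follows, using the same “lazy deletion” workaround for Decrease-key as in the Prim sketch earlier (heapq has no Decrease-key, so a fresh entry is pushed for each update and stale entries are skipped). Again, adj is assumed to map each vertex to a list of (neighbor, weight) pairs.

import heapq

def dijkstra(adj, s):
    d = {v: float('inf') for v in adj}
    parent = {v: None for v in adj}
    d[s] = 0
    done = set()
    pq = [(0, s)]                         # heap of (v.d, v) pairs
    while pq:
        du, u = heapq.heappop(pq)
        if u in done:
            continue                      # stale entry: skip it
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:             # found a shorter route to v
                d[v] = du + w
                parent[v] = u
                heapq.heappush(pq, (d[v], v))
    return d, parent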
In the first iteration of the above while loop, vertex S is removed from Q and the distance attributes
are updated for vertices A and D. The vertices in white are all those that are still in the queue, and those
in blue have already been removed.
At the next iteration, vertex D is removed from Q (it is the current minimum, because S has been removed).
Neighbor C is updated to C.d = D.d + 6 = 1 + 6 = 7. Similarly, neighbor B is updated to 4 + 1 = 5.
Vertex A is the next to be removed (A.d = 2). Neighbor C is updated to C.d = 2 + 2 = 4 and neighbor
X is updated to 2 + 4 = 6.
The remaining stages of Dijkstra’s are shown below:
The final tree edges are shown in orange. These edges represent the shortest path from vertex S to
any other vertex in the graph. As in Prim's algorithm, the parent pointers in Dijkstra's algorithm are set
whenever the distance is updated in the priority queue. Again, this update may occur several times.
However, once a vertex v is removed from the priority queue, the last update to v.parent is correct, and
corresponds to the actual edge in the SSSP.
Runtime: There are exactly O(V) Extract-min operations, and each one takes O(log V) time in a heap
of size O(V). This accounts for a runtime of O(V log V). Each directed edge accounts for at most one
Decrease-key operation, and each Decrease-key runs in time O(log V). This accounts for a runtime of O(E log V).
Building the initial heap and all other operations above run in time O(V ). The overall runtime is then
O(E log V ).
3 Bellman-Ford
The next algorithm we look at is the Bellman-Ford algorithm, which solves the general SSSP problem
given a directed graph G which may contain negative weights. Recall from above that although we
may have negative weights, no negative-weight cycle can exist. The Bellman-Ford algorithm will actually
detect a negative-weight cycle and return FALSE if one exists. Otherwise the algorithm produces all
shortest paths from s.
The main difference between Dijkstra’s algorithm and Bellman-Ford is that the latter reviews all edges
in the graph at each iteration in order to determine if any of them produce shorter paths. Recall that
Dijkstra’s algorithm only reviews the edges adjacent to the last vertex added to the shortest-path tree.
Bellman-ford does not require a Priority Queue, since we examine all edges at each iteration.
The algorithm is described below, and we use a slightly different example from the one we used for
Dijkstra's algorithm (the graph below now has negative weights).
Bellman-Ford
Step 1: Initialize the distances: for each v, set v.d = ∞. Set s.d = 0.
Initialize an empty tree T
Step 2: Build the tree:
loop through exactly V − 1 times:
for each edge (u, v) in E:
if v.d > u.d + w(u, v)
v.d = u.d + w(u, v) *Found a shorter route
v.parent = u
The first iteration above processes every single edge in the graph to determine if it can update v.d. The
order in which the edges are processed certainly affects the intermediary stages. In the example below, the
edges were processed in the following order: edges out of S, edges out of A, edges out of B, edges out of
C, edges out of D, edges out of H. The result after the first iteration of the outer loop is shown below:
We repeat this process again, processing the edges in the same order. It may be that many edges produce
no changes to the distance attributes; however, some edges reduce them. For example,
the edge from B to A now sets A.d = B.d − 4 = 5 − 4 = 1. Similarly, the edge from B to X reduces X.d
to X.d = 5 − 3 = 2.
The next pass through all the edges results in no changes. The main loop repeats V − 1 times in total,
and in this example the remaining passes make no further changes to any of the distance attributes.
After step 2, the final SSSP should be complete. Step 3 performs an additional pass through the graph,
and if an update is made, then this means there is a negative-weight cycle in the graph, and we must
return FALSE.
Step 3: The shortest paths should now be complete. If any edge can still reduce v.d, then there must
be a negative-weight cycle. Check this by reviewing the effect of all edges:
for each edge (u, v) in E:
If v.d > u.d + w(u, v) *This means there must be a negative-weight cycle
return FALSE
return TRUE
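The whole algorithm fits in a short Python sketch; edges are assumed to be given as a list of (u, v, w) triples, and vertices as a list of vertex names.

def bellman_ford(vertices, edges, s):
    d = {v: float('inf') for v in vertices}
    parent = {v: None for v in vertices}
    d[s] = 0
    # Step 2: relax every edge, V - 1 times.
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:           # found a shorter route to v
                d[v] = d[u] + w
                parent[v] = u
    # Step 3: one more pass; any further improvement means a negative cycle.
    for u, v, w in edges:
        if d[u] + w < d[v]:
            return False                  # negative-weight cycle detected
    return d, parent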
The example below on 4 vertices contains a negative-weight cycle. Since V = 4, the SSSP should be
complete after 3 passes. The first 3 passes are shown below:
When step 3 is carried out on the above graph on the right, the edge from B to A will reduce A.d =
2 − 5 = −3, resulting in a return value of FALSE. In this case, Bellman-Ford has identified a negative-weight
cycle.
Runtime: The nested for-loops in step 2 above account for a total runtime of O(V E). Step 3 runs in
time O(E) and step 1 runs in time O(V ). Thus the overall runtime is O(EV ).
4 The SSSP problem in DAGs
Recall that a DAG is a directed graph with no cycles. We saw in a previous lecture that it is possible to
topologically sort these graphs. In this section, we look at a simple way this topological sort can be used
to determine the shortest paths from s. Note that negative weights are acceptable for this type of graph,
since no cycles exist at all, negative or otherwise.
The topological sort enables us to determine in which order we should process the shortest paths.
Moving from left to right in the topological sort, any path from vertex u to v in the figure below can
only pass through vertices between u and v in the topological sort. Therefore we can actually process the
shortest path lengths from left to right in the topological sort.
Dag-Shortest-Path
Step 1: Topologically sort the vertices of G.
Step 2: Initialize the distances: v.d = ∞ for all vertices, except s.d = 0
Step 3: Loop through the vertices in the order they appear in the topological sort:
For each vertex u in topological order:
for each v in Adj [u]:
if v.d > u.d + w(u, v)
v.d = u.d + w(u, v)
v.parent = u
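A possible Python sketch, reusing the topological_sort function sketched in the earlier lecture (assumed available here) and the (neighbor, weight) adjacency representation used above:

def dag_shortest_path(adj, s):
    # Step 1: topologically sort the vertices (the weights are ignored here).
    order = topological_sort({u: [v for v, _ in adj[u]] for u in adj})
    # Step 2: initialize the distances.
    d = {v: float('inf') for v in adj}
    parent = {v: None for v in adj}
    d[s] = 0
    # Step 3: relax the out-edges of each vertex in topological order.
    for u in order:
        for v, w in adj[u]:
            if d[v] > d[u] + w:
                d[v] = d[u] + w
                parent[v] = u
    return d, parent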
Example:
Runtime: Recall that Step 1 above takes time O(V + E). In step 3, we carry out a constant amount
of work for each edge, for a total of O(E), and since the for-loop itself executes V times, the total runtime
of step 3 is O(V + E). Step 2 is clearly just O(V ). The overall runtime is O(V + E), making this a much
faster algorithm than Bellman-Ford or Dijkstra’s.