Strongly connected components (SCCs) can be thought of as self-contained cycles within a directed graph where every vertex in a given cycle can reach every other vertex in the same cycle.
If you look at the graph below, you will find that it has four SCCs, each has its own self-contained cycle and for each component, there is no way to find a path that leaves a component and returns back. This property ensures that SCCs in a graph are unique.
To find SCCs we have two algorithms at hand, Tarjan’s and Kosaraju’s algorithms. In this article we are going to cover Tarjan’s algorithm.
Before we jump to Tarjan’s algorithm, we must have an understanding of low-link values.
The low-link value of a node is the smallest node ID reachable from that node when doing a depth-first search (DFS), including itself.
There is a catch with doing a DFS on the graph, as it is highly dependent on the traversal order of the DFS, which is effectively random.
We are going to prove our above statement by giving you some examples:
In the above graph, we labeled it with some data and did the DFS on the 0th node, and let the DFS play and mark every node with its low-link value. After the graph is labeled it looks like this:
Seeing the results we might feel that it’s working fine. But, to contradict our example, we will now label the nodes differently and start the DFS again from the 0th node. Let’s see how it pans out.
As we can see all the nodes have the same low-link value, but there are multiple SCCs.
Note: Depending on where the DFS starts, and the order in which nodes/edges are visited, the low-link values for identifying SCCs could be wrong.
So, to make it work, we have to use an invariant that prevents SCCs from interfering with the low-link value of other SCCs.
To cope with the random traversal order of the DFS, Tarjan’s algorithm maintains a stack of valid nodes from which to update low-link values. Nodes are added to the stack of valid nodes as they are explored for the first time. Nodes are removed from the stack each time a complete SCC is found.
If u and v are nodes in a graph and we were currently exploring u, then our new low-link update condition is:
To update node u to node v low-link there has to be a path of edges from u to v and node v must be on the stack.
We are going to update low-link values on the fly during the DFS so we can get a linear O(V+E) time complexity.
Code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
#include<iostream>
#include<stack>
#define v 5
using namespace std;
int graph[v][v];
int min(int a, int b) {
return (a < b) ? a : b;
}
void findComponent(int u, int disc[], int lowLink[], stack < int > & stk, bool stkItem[]) {
static int time = 0;
disc[u] = lowLink[u] = ++time;
stk.push(u);
stkItem[u] = true;
for (int v = 0; v < v; v++) {
if (graph[u][v]) {
if (disc[v] == -1) {
findComponent(v, disc, lowLink, stk, stkItem);
lowLink[u] = min(lowLink[u], lowLink[v]);
} else if (stkItem[v])
lowLink[u] = min(lowLink[u], disc[v]);
}
}
int poppedItem = 0;
if (lowLink[u] == disc[u]) {
while (stk.top() != u) {
poppedItem = stk.top();
cout << poppedItem << " ";
stkItem[poppedItem] = false;
stk.pop();
}
poppedItem = stk.top();
cout << poppedItem << endl;
stkItem[poppedItem] = false;
stk.pop();
}
}
void strongConComponent() {
int disc[v], lowLink[v];
bool stkItem[v];
stack < int > stk;
for (int i = 0; i < v; i++) {
disc[i] = lowLink[i] = -1;
stkItem[i] = false;
}
for (int i = 0; i < v; i++)
if (disc[i] == -1)
findComponent(i, disc, lowLink, stk, stkItem);
}
int main() {
strongConComponent();
}
Mark the ID of each node as unvisited.
Start DFS. Upon visiting a node, assign it an ID and a low-link value. Also, mark current nodes as visited and add them to a seen stack.
On DFS callback, if the previous node is on the stack, then min the current node’s low-link value with the last node’s low-link value. After visiting all neighbors, if the current node started a connected component, then pop nodes off the stack until the current node is reached.