Tarjan's algorithm is a popular algorithm for finding strongly connected components (SCCs) in a directed graph. A strongly connected component is a subset of vertices in a directed graph where every vertex is reachable from every other vertex in the subset. In other words, the vertices in an SCC are strongly connected to each other.The algorithm works by performing a depth-first search (DFS) on the graph and maintaining several data structures as it progresses. Specifically, it uses a stack to keep track of the vertices visited in the current DFS traversal, an array to keep track of the discovery time of each vertex, and another array to keep track of the lowest discovery time of any vertex that is still reachable from the current vertex.
Index variable
Each node v is assigned a unique integer v.index, which numbers the nodes consecutively in the order in which they are discovered. It also maintains a value v.lowlink that represents the smallest index of any node on the stack known to be reachable from v through v's DFS subtree, including v itself. Therefore v must be left on the stack if v.lowlink < v.index, whereas v must be removed as the root of a strongly connected component if v.lowlink == v.index. The value v.lowlink is computed during the depth-first search from v, as this finds the nodes that are reachable from v.
Algorithm
Initialize variables: Initialize an index variable to 0 and an empty stack.
Visit each node: Iterate over each node in the graph. If a node has not been visited before (i.e., its index is undefined), call the strongconnect function on that node.
Strongly connect a node: The strongconnect function is called on a node. First, set the node's index and lowlink to the current index value, and increment the index by 1. Then, push the node onto the stack and mark it as being on the stack.
Visit the node's neighbors: Iterate over each neighbor of the node (i.e., each edge that originates from the node). If the neighbor has not been visited yet (i.e., its index is undefined), call the strongconnect function on that neighbor.
Update lowlink: If the neighbor is already on the stack (i.e., onStack is true), update the node's lowlink to the minimum of its current lowlink and the neighbor's index.
Finish the node: If the current node is a root node (i.e., its lowlink is equal to its index), it is the start of a new strongly connected component. Pop nodes off the stack until the current node is popped off, adding each node to the current strongly connected component. Mark each node as not being on the stack anymore. Output the current strongly connected component.
Repeat: Repeat steps 2-6 until all nodes have been visited.
Code
#include <vector>
#include <iostream>
#include <stack>
#include <cstring>
using namespace std;
const int node_num = 10000; // node 개수
const int edge_num = 100000; // edge 개수
int index[node_num]; // index
int index_low[node_num]; // index_low
int cur_time = 0; // 방문시각
bool on_stack[node_num]; // stack에 들어있는지 여부 판단하는 flag
vector<int> edge[node_num]; // Edge list
stack<int> s; // stack
vector<vector<int> > scc;
void tarjan(int cur_node){
index[cur_node] = index_low[cur_node] = cur_time++;
on_stack[cur_node] = 1; // 담겼다고 표시
s.push(cur_node); // stack에 추가
for(int i = 0; i < edge[cur_node].size(); i++){
int next_node = edge[cur_node][i];
// 인접한 node가 아직 방문을 하지 않은 경우
// 만약 해당 노드가 scc에 있다면 무조건 자기 자신으로 돌아오게 됨
if(index[next_node] == -1){
tarjan(next_node);
index_low[cur_node] = min(index_low[cur_node], index_low[next_node]);
}
// 방문을 했고, stack안에 있는 경우(index_low[next_node] < index[next_node]인 경우)
// 만약 방문을 했는데, stack안에 없는 경우는 이미 고려된 것
else if(on_stack[next_node]){
index_low[cur_node] = min(index_low[cur_node], index[next_node]);
}
}
vector<int> v;
if(index[cur_node] == index_low[cur_node]){
do{
int tmp = s.top();
v.push_back(tmp);
s.pop();
on_stack[tmp] = false;
}while(s.top() != cur_node);
scc.push_back(v);
}
}
int main(void){
memset(index, -1, sizeof(index));
memset(index_low, -1, sizeof(index));
memset(on_stack, 0, sizeof(on_stack));
for(int i = 0; i < edge_num; i++){
int s, e;
cin >> s >> e;
edge[s].push_back(e);
}
}
Stack Invariant
A node remains on the stack after it has been visited if and only if there exists a path in the input graph from it to some node earlier on the stack. In other words, it means that in the DFS a node would be only removed from the stack after all its connected paths have been traversed. When the DFS will backtrack it would remove the nodes on a single path and return to the root in order to start a new path.
Note
Actually, we can use index_low[cur_node] = min(index_low[cur_node], index_low[next_node]) instead of index_low[cur_node] = min(index_low[cur_node], index[next_node])