John recently bought the latest TboX game station and cannot wait to try out the adventure game. But to his surprise, the game levels are ordered in some random fashion and unlike other classical games (where the user starts with level one followed by level two, then three and so on), this game follows an unusually difficult ordering. The user manual lists that any level can be unlocked only after all of its dependent levels are unlocked, and these dependent level’s IDs may be bigger than that level’s ID. The manual looks like:
Number of levels = 4
Level ID | Dependent on Level ID |
---|---|
1 | 2,3 |
2 | None |
3 | 2 |
4 | 1,3 |
For the above example, level one can be unlocked only after levels two and three are unlocked, level three in turn can be unlocked only after level two is unlocked. Level two is not dependent on any other level and thus can be always unlocked. John, being an impatient PRO gamer, wants to complete the whole game in one sitting and is trying to list the levels in an order that he must follow to unlock the whole game. Can you help him to do so?
The answer for the above example can be [2, 3, 1, 4] (play level two followed by three, then one, then four).
Let’s choose the first level which we can start playing; intuitively it would be the level that is not dependent on any other level. It’s also crucial to observe that there can be multiple correct answers to this problem. Suppose multiple levels have zero dependencies. In that case, John can start his game with any of these levels and we can get more than one order.
Once the first level is decided let’s choose the second level, again following intuition, the second level has to be either one with zero dependencies or one which is dependent only on the level we chose on the previous step.
For the third level, we can choose either a level that has zero dependencies or one which is dependent only on our first level or which is dependent only on our second level or which is dependent only on our first and second level. Can you spot the pattern here?
This is called topological sorting.
Let’s convert the whole problem into a graph where each level is a vertex, and there is a directed node from level u to level v, if u is dependent on level v. The graph for the above chart then looks like this:
According to the problem, to reach vertex four we first need to reach all the vertices that four has a directed edge to. This reminds us of depth-first search (DFS). When run from any vertex v, DFS goes to all the vertices reachable from v. If, for every DFS call, we can store the current vertex in a list, then we get the topological sorting of the graph. This is exactly what we want.
The main algorithm for that would be:
DFS()
Iterate over all the vertices of the graph
If the current vertex is not marked:
Recursively call DFS() on the neighbors of this vertex
Insert this vertex on a global list
Implementation in C++
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
vector < vector < int >> adj; // adjacency list containing the graph
vector < int > order;
vector < bool > marked;
void DFS(int idx) {
marked[idx] = true;
for (int neighbors: adj[idx])
if (!marked[neighbors])
DFS(neighbors)
order.push_back(idx);
}
void topologicalSort(int n) {
marked.assign(n + 1, false);
for (int i = 1; i < n; i++)
if (!marked[i])
DFS(i);
// "order" now stores the topo sort.
// Do the further required operations with order
}
The only condition for topological sort to exist is that the graph should be acyclic, i.e, there should not be a cycle in the graph. It’s easy to see why that is true, we are traversing from a vertex to all its dependencies but in the case of a cycle, the vertex itself becomes one of its dependencies and thus contradicts the algorithm.
Problems involving topological sorting are easy to spot and the algorithm can be applied quickly. One just has to look for patterns, such as if any job can be achieved only after some other certain jobs are completed and make sure the resulting graph is acyclic. Problems employing just topological sort can easily be solved and are therefore standard, but a lot of beautiful problems can be created in which finding the topo-order is just a small step required to achieve the final answer.