Problem Of The Week: Disjoint Set Union And Graph Theory
This week, let's discuss a problem statement that was featured in the 2016 Topcoder Open Beijing Regionals.
The summary of the problem statement is as follows -
You are given two arrays, X and Y, of the same length.
X[ i ] and Y[ i ] are pairs that must never be separated.
Determine if its possible to extract a group of exactly K number of elements from the arrays without separating any pairs.
For example, if X and Y were the following -
X = [0, 1, 2, 3, 4]
Y = [2, 3, 0, 4, 1]
and K were equal to 3, then the answer would be "Possible", as the group {1, 3, 4} is a valid group that can be extracted from the arrays, and has a size equal to K.
In the above case, another group exists - {0, 2} however it does not have a size of K, and therefore cannot be considered as proof of validity.
Now, let's approach this in another way.
We know that for a given X[ i ], the corresponding Y[ i ] must be present in the group too, as X[ i ] and Y[ i ] are inseparable -- they are connected.
This is where graph theory comes into the picture. If we connect every X[ i ] with Y[ i ], we will consequently generate a graph, or possibly even multiple graphs.
After doing so, the formal problem statement we must then solve is to determine if there exists a graph which has exactly K vertices.
As we simply need to check the size of connected components for each graph that may arise from the arrays X and Y, we can leverage the data structure - disjoint set union.
The advantage of using a disjoint set union approach is that most implementations of this data structure are very short.
This data structure allows us to connect components and determine whether two components are connected or not, extremely efficiently.
Disjoint set union works by assigning a "parent" to every node.
Initially, every node is it’s own parent.
When two nodes are connected, the new parent is initialized as either one of the node's parents.
Thus, one can check if two nodes are connected by checking if both nodes have the same parent, as that would imply they belong to the same graph.
So now, we must connect X[ i ] and Y[ i ] for all 1 <= i <= N using this data structure.
Once this is done, we need to find the size of all graphs.
This too, is easy when using disjoint set union, and can be done by iterating over all vertices, and incrementing the size of their parents.
After doing this, we just need to iterate over all vertices and determine if any of their sizes equals K, as that would clearly be the parent of the graph containing K number of vertices.
The fastest submission was by xudyh who submitted the correct solution within 3 minutes and 21 seconds, after implementing it in C++.
The same approach was used as described here.