June 10, 2022 Single Round Match 831 Editorials
SmallestOppositeNumber
It always pays off when we can break the solution of a problem into a sequence of simpler steps. In this case we can identify two such steps:
- First, we should find the digits that have to appear in Y.
- Then, we should construct the smallest number with a given set of digits.
There are multiple ways to do the first step. One way that doesn’t need to deal with X = 0 as a special case is to start by converting X to a string. We can then look at the individual characters of that string, convert each back to a digit, and record that we have seen that digit.
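In Java, this string-based variant takes only a couple of lines; a minimal sketch (not the code used in the solution shown later):

boolean[] seenDigit = new boolean[10];
// Integer.toString(0) is "0", so X = 0 needs no special handling here
for (char ch : Integer.toString(X).toCharArray()) seenDigit[ch - '0'] = true;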
In the implementation shown below we choose a different way. We handle X = 0 as a special case. For a positive X we construct its digits by repeatedly looking at the value (X modulo 10) = the least significant digit of X, and then dividing X by 10 = removing the digit we just processed.
Now for the second step: given a set of (one to nine) digits, what is the smallest number that contains them all?
Clearly, the best solution is always a number that contains each of those digits exactly once. (Any other number would have more digits and thus a bigger value.)
As we have at most nine digits, we can afford to iterate over all permutations, discard the invalid ones (i.e., those that start with an unnecessary leading zero) and take the smallest out of all remaining numbers.
But we can do the same thing in a much more efficient way, if we realize the following things:
- If the only digit we have is the digit 0, the answer is 0.
- In all other cases, the first digit of the number will be the smallest non-zero digit we have, and the remaining digits will be ordered smallest to largest.
public int construct(int X) {
    // fill the array seenDigit: which digits have we seen in X?
    boolean[] seenDigit = new boolean[10];
    if (X == 0) {
        seenDigit[0] = true;
    } else {
        while (X > 0) {
            seenDigit[ X%10 ] = true;
            X /= 10;
        }
    }

    // is the only unseen digit 0? if yes, the answer is 0
    int seenDigitCount = 0;
    for (boolean sd : seenDigit) if (sd) ++seenDigitCount;
    if (seenDigitCount == 9 && !seenDigit[0]) return 0;

    // find and use the smallest unseen non-zero digit first
    int answer = 0;
    for (int i=1; i<10; ++i) if (!seenDigit[i]) {
        answer = i;
        seenDigit[i] = true;
        break;
    }

    // append all remaining unseen digits, smallest to largest
    for (int i=0; i<10; ++i) if (!seenDigit[i]) {
        answer = 10*answer + i;
    }
    return answer;
}
LockedBoxes
Here we opted to include a problem that actually has a solution for constraints much bigger than the ones we used in the problem. Thus, there were two very distinct ways in which you could solve this problem: either by finding and implementing an efficient algorithm (i.e., one that runs in polynomial time), or by implementing a reasonably efficient brute force solution. We’ll start by explaining the brute force approach, then we’ll discuss the better solution.
We can first start by finding a better representation for the data we got in the input. Clearly, multiple copies of the same key in the same box are redundant and can be discarded – we only care whether or not box x contains (at least one) key y.
There are 2^N possibilities for the set of boxes we should pry open. As N <= 20, we can afford to iterate over all of them. All we need now is a quick way to check whether a given set of opened boxes allows us to eventually open all other boxes. Once we know how to make that check, we can simply find and return the smallest set for which the check passes.
And implementing the check is fairly straightforward: this is just an ordinary graph traversal. For example, we can use breadth-first search (BFS). The boxes are the vertices. We start from all the boxes we forced open. The keys are edges: if box x contains key y, we have a directed edge from x to y meaning “if x is open, you can also open y”. The check passes if the BFS visits all vertices = opens all remaining boxes.
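A minimal sketch of that check (with our own helper names – the reference code below uses bitmasks instead), assuming keysIn.get(x) lists the keys found in box x and the usual java.util imports:

boolean canOpenAll(int N, List<List<Integer>> keysIn, List<Integer> forced) {
    boolean[] open = new boolean[N];
    ArrayDeque<Integer> queue = new ArrayDeque<Integer>();
    // start the BFS from all the boxes we forced open
    for (int x : forced) if (!open[x]) { open[x] = true; queue.add(x); }
    while (!queue.isEmpty()) {
        int x = queue.poll();
        // every key found in an open box opens another box
        for (int y : keysIn.get(x)) if (!open[y]) { open[y] = true; queue.add(y); }
    }
    for (boolean o : open) if (!o) return false;
    return true;
}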
For an even simpler solution, we can store the information about keys in a very compact way by using a bitset: for each box x we’ll simply remember an N-bit number unlocks[x], in which bit y is set to 1 if and only if box x contains key y. Each unlocks[x] can be understood as the set of boxes we’ll be able to open if we somehow open box x.
Additionally, in each unlocks[x] we will set bit x to 1. (“If we can open box x, we can open box x.”) This is just a technicality that will make the next step slightly easier.
If you already have the boxes in some set S open, the union of unlocks[x] over all x in S tells you the set of boxes you can open now. Take that as your new set S of open boxes. As the set S only grows in each step, this process will terminate after at most N steps. At that moment, S contains all the boxes you could eventually open. If something is missing from S, the check fails.
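In code, the whole check is then just a few lines. A sketch (the openAllSlow method shown later avoids the outer loop entirely by precomputing the transitive closure first):

boolean canOpenAll(int N, int[] unlocks, int opened) {
    // "opened" is the bitmask of boxes we forced open
    while (true) {
        int next = opened;
        for (int x = 0; x < N; ++x)
            if ((opened & (1 << x)) != 0) next |= unlocks[x];
        if (next == opened) break;   // fixed point: nothing new can be opened
        opened = next;
    }
    return opened == (1 << N) - 1;   // did all N boxes end up open?
}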
An even better way of doing the checks is to precompute the transitive closure of the given graph: if we have edges x -> y and y -> z (opening x allows you to open y, and opening y allows you to open z), we can also add the edge x -> z (opening x eventually allows you to open z). A simple way to compute the transitive closure in O(N^3) time is essentially a special case of the Floyd-Warshall shortest path algorithm.
Once we have computed the transitive closure, for each box x we have the set eventually_unlocks[x] of boxes we can unlock in one or more steps when starting from x. At this point, we can do the check for any set of initially open boxes simply by taking the union of their eventually_unlocks. This gives us a solution in O(N*2^N) time.
And that brings us to the polynomial-time solution. Take the directed graph we have and divide it into strongly connected components (SCCs). Clearly, once you can open one box in a strongly connected component, you can open everything else in that component, and also all other boxes reachable from that component.
Some of the SCCs are “top” SCCs: SCCs such that no edge from the outside enters them. For any such SCC there is no way to open its boxes without forcing open at least one of them. Thus, the optimal solution needs at least as many boxes as there are “top” SCCs. But on the other hand we can easily verify that opening exactly one box in each “top” SCC and nothing else is a valid solution.
Given how small our N is, we can do the transitive closure. Then, the SCC that contains some specific vertex v is simply the intersection of two sets: vertices reachable from v and vertices from which v is reachable. This gives us an O(N^3) solution that could be further optimized to run in O(M+N) time if we had really large inputs.
int popCount(int x) {
    int answer = 0;
    while (x != 0) { ++answer; x &= x-1; }
    return answer;
}

// openAllSlow: exponential solution with bitmasks and transitive closure
public int[] openAllSlow(int N, int[] box, int[] key) {
    // G[x][y]: opening box x (eventually) lets us open box y
    boolean[][] G = new boolean[N][N];
    for (int i=0; i<box.length; ++i) G[ box[i] ][ key[i] ] = true;
    for (int i=0; i<N; ++i) G[i][i] = true;
    // transitive closure, Floyd-Warshall style
    for (int k=0; k<N; ++k) for (int i=0; i<N; ++i) for (int j=0; j<N; ++j)
        if (G[i][k] && G[k][j]) G[i][j] = true;

    // unlocks[i]: bitmask of all boxes we can open once box i is open
    int[] unlocks = new int[N];
    for (int i=0; i<N; ++i) for (int j=0; j<N; ++j)
        if (G[i][j]) unlocks[i] |= 1<<j;

    // try all subsets of boxes to force open, remember the smallest valid one
    int best = N+1, breakOpen = -1;
    for (int subset=0; subset<(1<<N); ++subset) {
        int curr = popCount(subset);
        if (curr >= best) continue;
        int opened = 0;
        for (int n=0; n<N; ++n) if ((subset & 1<<n) != 0) opened |= unlocks[n];
        if (opened == (1 << N) - 1) { best = curr; breakOpen = subset; }
    }

    int[] answer = new int[best];
    for (int n=0, k=0; n<N; ++n) if ((breakOpen & 1<<n) != 0) answer[k++] = n;
    return answer;
}

// openAllFast: polynomial solution that finds top SCCs
public int[] openAllFast(int N, int[] box, int[] key) {
    boolean[][] G = new boolean[N][N];
    for (int i=0; i<box.length; ++i) G[ box[i] ][ key[i] ] = true;
    for (int i=0; i<N; ++i) G[i][i] = true;
    for (int k=0; k<N; ++k) for (int i=0; i<N; ++i) for (int j=0; j<N; ++j)
        if (G[i][k] && G[k][j]) G[i][j] = true;

    // outwards[i]: vertices reachable from i; inwards[i]: vertices that reach i
    int[] outwards = new int[N];
    for (int i=0; i<N; ++i) for (int j=0; j<N; ++j)
        if (G[i][j]) outwards[i] |= 1<<j;
    int[] inwards = new int[N];
    for (int i=0; i<N; ++i) for (int j=0; j<N; ++j)
        if (G[i][j]) inwards[j] |= 1<<i;

    int[] almostAnswer = new int[N];
    int taken = 0;
    boolean[] seen = new boolean[N];
    for (int n=0; n<N; ++n) {
        if (seen[n]) continue;
        // the SCC of n: vertices that both reach n and are reachable from n
        int component = (outwards[n] & inwards[n]);
        // the SCC is "top" if no edge enters it from the outside
        boolean isTop = true;
        for (int i=0; i<N; ++i) for (int j=0; j<N; ++j) {
            if (G[i][j] && ((component & 1<<i) == 0)
                        && ((component & 1<<j) != 0)) {
                isTop = false;
            }
        }
        if (isTop) almostAnswer[taken++] = n;
        for (int i=0; i<N; ++i) if ((component & 1<<i) != 0) seen[i] = true;
    }

    int[] answer = new int[taken];
    for (int i=0; i<answer.length; ++i) answer[i] = almostAnswer[i];
    return answer;
}
RerollCheater
We have the C values we rolled initially and the sequence of N values that will come up, in order, during any future rerolls.
Regardless of what we do, the final C values will surely be some C out of these C+N values.
Let’s now look at all C+N values we have, find the C largest ones (breaking ties arbitrarily) and color all of them red. Out of all sets of C numbers chosen from the input, the red set has the highest possible sum.
If we could always do the rerolls so that we’ll end with the C red numbers, we would clearly have an optimal solution.
And we can always do this. If there are K non-red numbers among the ones we rolled initially, this means that there are exactly K red numbers among the values that will come up in rerolls. We can now do the rerolls for example by K iterations of the following process: take any non-red number you have and reroll it until you get a red result.
public int[] reroll(int[] currentDice, int[] futureRolls) {
    int C = currentDice.length, F = futureRolls.length;

    // color the largest C values red (mark them as "wanted")
    boolean[] wantedCurrent = new boolean[C];
    boolean[] wantedFuture = new boolean[F];
    int wanted = C;
    for (int value=20; value>=1; --value) {
        for (int c=0; c<C; ++c) if (currentDice[c] == value && wanted > 0) {
            wantedCurrent[c] = true;
            --wanted;
        }
        for (int f=0; f<F; ++f) if (futureRolls[f] == value && wanted > 0) {
            wantedFuture[f] = true;
            --wanted;
        }
    }

    // fill an ArrayList with the rerolls we want to make:
    // keep rerolling each unwanted die until a wanted future roll comes up
    ArrayList<Integer> answer = new ArrayList<Integer>();
    int where = 0;
    for (int c=0; c<C; ++c) if (!wantedCurrent[c]) {
        while (true) {
            answer.add(c);
            ++where;
            if (wantedFuture[where-1]) break;
        }
    }

    // convert the ArrayList to an int[] of a fixed size and return it
    int[] finalAnswer = new int[answer.size()];
    for (int i=0; i<answer.size(); ++i) finalAnswer[i] = answer.get(i);
    return finalAnswer;
}
Potatoes
This is a graph theory problem. Imagine that soil cells are vertices of a graph, and edges connect pairs of cells that share a side. This graph is bipartite: if we color the entire field in a checkerboard pattern, each edge connects a black cell to a white one. Valid potato placements are precisely all independent sets in this graph – i.e., subsets of vertices such that no two vertices in the subset are connected by an edge.
Thus, we can solve the problem by finding the largest independent set in the given graph, comparing its size to the number N of potatoes we have, and if the size is sufficient, we can use any N vertices of the maximum independent set as the locations where to plant our potatoes. (Subsets of an independent set are clearly also independent.)
In bipartite graphs, independent sets and their complements, which are called vertex covers, are closely tied to matchings. (A vertex cover is formally defined as a set V of vertices such that each edge has at least one of its endpoints in V. Convince yourself that the complement of any vertex cover is an independent set and vice versa.)
In particular, Kőnig’s theorem tells us that the size of the maximum matching equals the size of the minimum vertex cover in the given graph, and the complement of that minimum vertex cover is the maximum independent set we seek.
One inequality in the above statement is clear: If there is a matching with X edges, any vertex cover must have at least X vertices because no vertex can “cover” more than one edge from the matching.
We can prove the other inequality and thus the above theorem constructively: by showing how to find a vertex cover of the same size as the maximum matching.
Suppose our bipartite graph has a left and a right partition. Given any matching, augmenting paths are paths that start in an unmatched vertex of the right partition, alternately use edges that don’t belong and do belong to the matching, and end in an unmatched vertex of the left partition. The simplest polynomial-time algorithm to find the maximum matching in a bipartite graph consists of repeatedly using BFS to find an augmenting path and then “xor-ing” that path with the current matching to increase its size by 1.
Once we have constructed the maximum matching, run the BFS one more time (starting from all currently unmatched vertices in the right partition). This BFS will reach some part of the graph, but it will never reach any unmatched vertex in the left partition. Once the BFS terminates, we can construct a set V as follows:
- For each edge in the matching, if the BFS reached its left vertex, take that left vertex into V.
- And for each edge in the matching, if the BFS did not reach its left vertex, take its right vertex into V.
The set V now has exactly one vertex from each edge of the matching, so it has the same size. We claim that V is a valid vertex cover. (And thus V is a minimum vertex cover and its complement gives the optimal potato-planting locations.) Why is this so? V clearly “covers” the edges that form the matching. Out of the other edges, some were traversed by the BFS and some weren’t. Each edge that was traversed has its left endpoint in V (obviously) and each edge that wasn’t traversed has its right endpoint in V. (This is because the right endpoint of such an edge cannot be unmatched, so it’s a matched vertex that wasn’t reached by the BFS, and its match from the left side therefore also wasn’t reached by the BFS.)
The above gives us a polynomial-time algorithm to find the maximum independent set in a bipartite graph. Even the basic implementation should be fast enough for our constraints (up to 1250 vertices in each partition, where each vertex has degree at most 4).
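For reference, the whole pipeline described above translates into code roughly as follows. This is only a sketch with our own names and graph representation (not the reference solution): a DFS-based augmenting-path search (the BFS variant described above works equally well), followed by the final BFS and the Kőnig construction. We assume adj.get(r) lists the left-partition neighbors of right vertex r, and the usual java.util imports.

boolean augment(int r, List<List<Integer>> adj, int[] matchL, int[] matchR, boolean[] visited) {
    for (int l : adj.get(r)) {
        if (visited[l]) continue;
        visited[l] = true;
        // take l if it is free, or if its current partner can be rematched
        if (matchL[l] == -1 || augment(matchL[l], adj, matchL, matchR, visited)) {
            matchL[l] = r; matchR[r] = l;
            return true;
        }
    }
    return false;
}

// returns {coverL, coverR}: a minimum vertex cover; its complement is a
// maximum independent set = the potato-planting locations
boolean[][] minVertexCover(int L, int R, List<List<Integer>> adj) {
    int[] matchL = new int[L], matchR = new int[R];
    Arrays.fill(matchL, -1); Arrays.fill(matchR, -1);
    for (int r = 0; r < R; ++r) augment(r, adj, matchL, matchR, new boolean[L]);

    // BFS from all unmatched right vertices, alternating non-matching
    // edges (right -> left) and matching edges (left -> right)
    boolean[] reachedL = new boolean[L], reachedR = new boolean[R];
    ArrayDeque<Integer> queue = new ArrayDeque<Integer>();
    for (int r = 0; r < R; ++r) if (matchR[r] == -1) { reachedR[r] = true; queue.add(r); }
    while (!queue.isEmpty()) {
        int r = queue.poll();
        for (int l : adj.get(r)) {
            if (reachedL[l]) continue;
            reachedL[l] = true;
            // l must be matched; otherwise we would have an augmenting path
            int r2 = matchL[l];
            if (r2 != -1 && !reachedR[r2]) { reachedR[r2] = true; queue.add(r2); }
        }
    }

    // Kőnig: reached left vertices, plus right endpoints of matching
    // edges whose left endpoint was not reached
    boolean[] coverL = new boolean[L], coverR = new boolean[R];
    for (int l = 0; l < L; ++l) coverL[l] = reachedL[l];
    for (int r = 0; r < R; ++r) coverR[r] = (matchR[r] != -1) && !reachedR[r];
    return new boolean[][] { coverL, coverR };
}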
JumpyCheckers
Constraints for this problem were chosen so that solutions exponential in R*C would time out by a wide margin, but solutions exponential in R+C or min(R,C) would be at least fast enough to precompute all possible answers. We are aware of many such solutions, with very different bases of the exponential function. Depending on which of those solutions you discover, this may offer an interesting trade-off between the time spent implementing such a solution and the time it then needs to obtain answers for all 210 distinct inputs.
For complexity estimates below, assume R <= C, and assume that each big-Oh is actually Oh-hat that hides not only constants but also polynomial terms.
Step one towards actually finding a solution of the type mentioned above is to note that it may be easier to count the complement: configurations in which no jumps are possible. These configurations are much more structured. As soon as there are two diagonally adjacent pieces anywhere in such a configuration, the entire diagonal must be full of pieces. (If there were any empty cells on that diagonal, go from our two pieces along the diagonal until you find the first empty cell, and that’s where a jump can be made.)
Thus, the non-jumpy configurations consist of some full diagonals and some isolated pieces between them.
One fairly straightforward technique (albeit moderately annoying in terms of implementation) to count all non-jumpy configurations is to pick one direction of diagonals and then build the configuration by systematically adding one diagonal at a time. For example, consider the figure below:
1.2.3.4.5.
.2.3.4.5..
2.3.4.5...
.3.4.5....
3.4.5.....
The numbers 1, 2, 3, 4, 5 show the first five diagonals in the chosen direction. We can incrementally build our configuration by choosing what’s on diagonal 1, then what’s on diagonal 2, and so on – always making sure the next diagonal is compatible with the previous one.
Once we have some partial solution, let’s now look at the other direction of diagonals. Two of those are denoted using * and # below.
*.2.#.4.5.
.*.3.#.5..
2.*.4.#...
.3.*.5....
3.4.*.....
Each of these diagonals is in one of three distinct states: either it’s known to be completely full, or we just placed an empty cell on it, or we just placed a piece on it (in which case the diagonal so far is a sequence of empty cells and pieces in which no two pieces are adjacent, currently ending in a piece). This gives us at most O(3^R) distinct states, and we can use DP / memoization to count the number of ways in which each of those can be reached.
A more precise estimate of the number of actually reachable states for any fixed diagonal is O(5^(R/2)), which is approximately O(2.24^R). This is because, except for some negligible pathological cases, at least one of any two adjacent diagonals is in the state “currently ends in an empty cell”, so for any two adjacent diagonals there are at most five valid pairs of states.
When placing the next diagonal, the options to consider are a completely full diagonal, or a pattern in which no two pieces are adjacent. The number of such patterns for a diagonal of length L is roughly 1.62^L (the exact counts are Fibonacci numbers).
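As a quick sanity check of that count, here is a tiny helper (our own, not part of any reference solution) that counts the patterns for a diagonal of length L:

// counts 0/1 patterns of length L with no two adjacent 1s;
// f(L) = f(L-1) + f(L-2) with f(0) = 1 and f(1) = 2, i.e., the
// Fibonacci number F(L+2), which grows roughly as 1.618^L
long nonAdjacentPatterns(int L) {
    long prev = 1, curr = 2;
    if (L == 0) return prev;
    for (int i = 2; i <= L; ++i) {
        long next = prev + curr;
        prev = curr; curr = next;
    }
    return curr;
}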
This gives us a guarantee that the solution that takes all reachable states for the previous diagonal and tries adding all possible contents for the next diagonal runs in roughly 3.62^R. Further speedup is possible (if necessary) by first deriving all information we can from the previous state (i.e., each full diagonal must continue with a piece, and each non-trivial diagonal that just had a piece must continue with an empty cell) and then only generating the valid ways to fill in the rest of the next diagonal. Solutions of this type are already fast enough to precompute all answers.
There are multiple solutions asymptotically faster than the ones described above. What is the best base B such that you can find a solution that runs in O(B^R) time? Is B=2 possible? Is B<2 possible? Or is there even a formula that can be evaluated in polynomial time?
misof