October 8, 2022 Single Round Match 835 Editorials

GoodLetters

Here’s an easy approach:

  • Use a boolean array to record which letters are good.
  • Count the good letters.
  • Count the bad letters, or deduce that their count is 26 – count of good letters.
  • Check whether you have enough of each type.
  • If no, report no solution.
  • If yes, iterate over the boolean array to collect as many as you need of each type.
boolean[] getGood(String good) {
    boolean[] answer = new boolean[26];
    for (char c : good.toCharArray()) answer[ c-'A' ] = true;
    return answer;
}

public String construct(String good, int N, int G) {
    boolean[] isGood = getGood(good);

    int countGood = 0;
    for (boolean x : isGood) if (x) ++countGood;
    int countBad = 26 - countGood;

    if (G > countGood || N-G > countBad) return "";

    int needGood = G, needBad = N-G;
    String answer = "";
    for (char c='A'; c<='Z'; ++c) {
        if ( isGood[c-'A'] && needGood > 0) { answer += c; --needGood; }
        if (!isGood[c-'A'] && needBad  > 0) { answer += c; --needBad;  }
    }
    return answer;
}
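
For example (our own illustration), construct("ABC", 4, 2) returns "ABDE": the loop picks the two alphabetically smallest good letters (A, B) and then the two alphabetically smallest bad ones (D, E).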

Doors

This problem is solvable greedily – it’s actually a special case of the minimum spanning tree (MST) problem. Or, more precisely, the minimum spanning forest, as our building can consist of multiple connected components.

To solve the problem we can either use one of the efficient general algorithms (e.g., Jarník-Prim or Kruskal), or we can implement a simpler specialized version. Below we explain a full derivation of the simpler specialized solution.

Observation 1: If we build the doors optimally, there will be no cycles.

Proof: If you have a cycle (i.e., you are able to start in some room, walk through a sequence of distinct doors and return back to the room), you can remove any one door from the cycle. You will still be able to reach everywhere and the new solution will be cheaper.

Observation 2: It is always optimal to build all horizontal (i.e., cost=1) doors.

Proof: Suppose you have a solution in which you don’t do that. Pick any one of the horizontal doors it does not contain and build it. This must create a cycle. (If the door connected rooms X and Y, the solution we started with contains some path from X to Y, and that path + the new door = the cycle.) That cycle must contain some vertical (cost=2) doors. We can remove any one of those doors from the cycle to get a new valid solution that’s cheaper than the original one.

Thus, we can start by building all possible horizontal doors. This will create some connected components of rooms – at this point, each such component is simply a contiguous part of a single row.

Now all that remains is to build as many vertical doors as necessary. Due to the input being small, we can afford to do this inefficiently. Imagine that we color the rooms according to the component they are currently in. We can then simply do the following: as long as there are two rooms with different colors directly above one another, build the door between them and recolor one of the two components to match the color of the other.

The proof that the above approach always works (regardless of which vertical doors we build and in which order) will actually give us an even simpler solution to the original problem.

For the proof, note that:

  • We know X = the starting number of connected components: this is simply the total number of contiguous segments of rooms within rows.
  • We know Y = the final number of connected components: these are simply the connected components of rooms in the traditional sense.
  • We can observe that adding any vertical door merges two components into one, and thus it decreases the number of connected components by 1.

Hence, regardless of how we build the vertical doors, we will eventually build exactly X-Y of them.

And this is then our simplest solution to the problem:

  • Scan the plan of the building to count the number H of horizontal doors and the number X defined above.
  • Use any graph traversal (e.g., BFS or DFS) to count Y.
  • The answer we should return is 1*H + 2*(X-Y). (See the sketch below.)
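
Here is a minimal sketch of this counting solution (our own illustration; the method name buildSimple and the iterative DFS are not from the original write-up):

// counts H, X and Y as defined above and combines them into the answer
public int buildSimple(String[] plan) {
    int R = plan.length, C = plan[0].length();

    int H = 0; // number of horizontal doors we can build
    int X = 0; // number of contiguous room segments within rows
    for (int r = 0; r < R; ++r) for (int c = 0; c < C; ++c) {
        if (plan[r].charAt(c) != '.') continue;
        if (c+1 < C && plan[r].charAt(c+1) == '.') ++H;
        if (c == 0 || plan[r].charAt(c-1) != '.') ++X; // a segment starts here
    }

    // Y = number of connected components, counted by an iterative DFS
    int Y = 0;
    boolean[][] seen = new boolean[R][C];
    int[] dr = {1, -1, 0, 0}, dc = {0, 0, 1, -1};
    for (int r = 0; r < R; ++r) for (int c = 0; c < C; ++c) {
        if (plan[r].charAt(c) != '.' || seen[r][c]) continue;
        ++Y;
        java.util.ArrayDeque<int[]> stack = new java.util.ArrayDeque<>();
        seen[r][c] = true;
        stack.push(new int[] {r, c});
        while (!stack.isEmpty()) {
            int[] cell = stack.pop();
            for (int d = 0; d < 4; ++d) {
                int nr = cell[0] + dr[d], nc = cell[1] + dc[d];
                if (nr < 0 || nr >= R || nc < 0 || nc >= C) continue;
                if (plan[nr].charAt(nc) != '.' || seen[nr][nc]) continue;
                seen[nr][nc] = true;
                stack.push(new int[] {nr, nc});
            }
        }
    }

    return 1*H + 2*(X - Y);
}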

The implementation shown below uses the O(N^2) version of the Jarník-Prim general MST algorithm, where N=R*C is the number of cells in the building.

int encode(int R, int C, int r, int c) {
    return r*C + c;
}

public int build(String[] plan) {
    int R = plan.length, C = plan[0].length();
    int N = R*C;

    boolean[] alive = new boolean[N];
    for (int r=0; r<R; ++r) for (int c=0; c<C; ++c) {
        if (plan[r].charAt(c) != '.') continue;
        int x = encode(R,C,r,c);
        alive[x] = true;
    }

    // adjacency matrix: cost 1 for horizontal neighbors, 2 for vertical,
    // and 999 (= infinity) where no door can be built
    int[][] G = new int[N][N];
    for (int a=0; a<N; ++a) for (int b=0; b<N; ++b) G[a][b] = (a==b ? 0 : 999);
    for (int r1=0; r1<R; ++r1) for (int c1=0; c1<C; ++c1) {
        for (int r2=0; r2<R; ++r2) for (int c2=0; c2<C; ++c2) {
            if (Math.abs(r1-r2) + Math.abs(c1-c2) != 1) continue;
            if (plan[r1].charAt(c1) != '.') continue;
            if (plan[r2].charAt(c2) != '.') continue;
            int x1 = encode(R,C,r1,c1), x2 = encode(R,C,r2,c2);
            G[x1][x2] = G[x2][x1] = (r1 == r2 ? 1 : 2);
        }
    }

    int answer = 0;

    boolean[] processed = new boolean[N];

    // run Jarník-Prim once for each connected component
    for (int n=0; n<N; ++n) if (alive[n] && !processed[n]) {
        int[] distance = new int[N];
        for (int a=0; a<N; ++a) distance[a] = 999;
        distance[n] = 0;
        while (true) {
            int curr = -1;
            for (int a=0; a<N; ++a) {
                if (!alive[a]) continue;
                if (processed[a]) continue;
                if (curr == -1) curr = a;
                if (distance[a] < distance[curr]) curr = a;
            }
            if (curr == -1) break;             // no unprocessed rooms left
            if (distance[curr] == 999) break;  // the rest lies in other components
            answer += distance[curr];
            processed[curr] = true;
            for (int a=0; a<N; ++a) {
                distance[a] = Math.min( distance[a], G[curr][a] );
            }
        }
    }
    return answer;
}

SortTwoArrays

If we are sorting just one array of length N, we can always do it by making at most N-1 swaps. (E.g., as follows: take the biggest element and swap it with the last one, then the second biggest with the second-to-last one, and so on. Once the second smallest element is in place, the smallest one must be as well.)

One possible strategy when sorting two arrays would be to find a way to swap two elements of one array without modifying the other, and then use the above algorithm on each array separately. It turns out that this can be done, but it requires too many steps. (If we want to swap A[x] and A[y], we can swap A[x] with B[0], then A[y] with B[0], and then A[x] with B[0] again. As we need three actual swaps to execute one swap within A, we would need 2*3*(N-1) swaps in the worst case when sorting both A and B this way.)
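
For concreteness, here is that three-swap trick as code (swapAB is a hypothetical helper, not part of the solution below):

// hypothetical helper: exchange A[x] with B[y]
void swapAB(int[] A, int[] B, int x, int y) {
    int tmp = A[x]; A[x] = B[y]; B[y] = tmp;
}

// swap A[x] and A[y] without changing B, using B[0] as scratch space
void swapWithinA(int[] A, int[] B, int x, int y) {
    swapAB(A, B, x, 0); // A[x]'s old value moves into B[0]
    swapAB(A, B, y, 0); // A[y] now holds A[x]'s old value
    swapAB(A, B, x, 0); // A[x] gets A[y]'s old value; B[0] is restored
}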

So, we have to come up with something different. The key thing we did not use in the previous approach: we aren’t required to keep all the elements in their original arrays!

One simple idea is to apply the same algorithm as above, but to both arrays at once. More precisely, we repeatedly:

  • find the two largest elements
  • move one of them to the end of A and the other to the end of B
  • forget about them and pretend our two arrays now only have length N-1

This approach clearly works: all elements that are left in the arrays are at most as big as the two we just processed, and thus regardless of how we rearrange the rest, the last two elements will never cause an issue.

Can we make this approach fast enough? Sure we can, as follows:

  • find the largest element
  • use one swap to move it to the end of the opposite array
  • find the second largest element (i.e., the largest among the other 2N-1)
  • use at most two swaps to move it to the end of the other array

For example, if the largest element is at A[7], we swap it with B[N-1]. Then, if the second largest element is at B[3], we just swap it with A[N-1], and if the second largest element is at A[3], we swap A[3] with B[0] and then B[0] with A[N-1].

This algorithm can clearly sort any pair of N-element arrays using at most 3*(N-1) swaps.

int findMax(int[] arr, int prefix) {
    int answer = 0;
    for (int i=1; i<prefix; ++i) if (arr[i] > arr[answer]) answer = i;
    return answer;
}

void swap(int[] A, int[] B, int a, int b, ArrayList<Integer> swaps) {
    int tmp = A[a];
    A[a] = B[b];
    B[b] = tmp;
    swaps.add(a);
    swaps.add(b);
}

public int[] twoSorts(int N, int[] A, int[] B) {

    ArrayList<Integer> swaps = new ArrayList<Integer>();

    for (int p=N-1; p>=1; --p) {
        // find the two largest elements and move them to the ends
        int a = findMax(A, p+1), b = findMax(B, p+1);
        if (A[a] > B[b]) {
            swap(A, B, a, p, swaps);
            // now we have the largest element at the end of B, and we want 
            // the second-largest at the end of A
            int a2 = findMax(A, p+1);
            int b2 = findMax(B, p);
            if (A[a2] > B[b2]) {
                swap(A, B, a2, 0, swaps);
                swap(A, B, p, 0, swaps);
            } else {
                swap(A, B, p, b2, swaps);
            }
        } else {
            // the exact same thing as above, just with A and B swapped
            swap(A, B, p, b, swaps);
            // now we have the largest element at the end of A, and we want 
            // the second-largest at the end of B
            int a2 = findMax(A, p);
            int b2 = findMax(B, p+1);
            if (A[a2] > B[b2]) {
                swap(A, B, a2, p, swaps);
            } else {
                swap(A, B, 0, b2, swaps);
                swap(A, B, 0, p, swaps);
            }
        }
    }

    int[] answer = new int[swaps.size()];
    for (int i=0; i<swaps.size(); ++i) answer[i] = swaps.get(i);
    return answer;
}
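
As a quick sanity check (our own hypothetical harness, not part of the solution), we can replay the returned swaps on copies of the inputs and confirm that both arrays end up sorted:

int[] A = {5, 1, 4}, B = {3, 2, 6};
int[] a = A.clone(), b = B.clone();
int[] moves = twoSorts(3, a, b);    // sorts a and b in place
for (int i = 0; i < moves.length; i += 2) {
    // each pair (moves[i], moves[i+1]) means "exchange A[moves[i]] with B[moves[i+1]]"
    int tmp = A[moves[i]];
    A[moves[i]] = B[moves[i+1]];
    B[moves[i+1]] = tmp;
}
// A and B now equal the sorted a and b, after at most 3*(3-1) = 6 swaps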

MostPeriodic

Multiple approaches exist. We'll present a solution whose correctness is easy to argue.

Suppose we want to check whether the given collection of letters can be used to build a string with period P. Once we fix the value P, the letters of the desired string can be divided into equivalence classes: for each i between 0 and P-1, inclusive, all letters at indices of the form i+k*P must be identical. Each of these equivalence classes has either floor(N/P) or floor(N/P)+1 elements, and we know that there are exactly N mod P longer ones and P - (N mod P) shorter ones. (For example, with N=7 and P=3 the classes are the index sets {0,3,6}, {1,4} and {2,5}: one long class and two short ones.) All that remains is to check whether we can take the given collection of letters and split them among these equivalence classes.

We can do this check in a reasonably efficient way by checking all possibilities – while using memoization / dynamic programming to make sure we aren’t processing the same state more than once.

The state of our search is a triple:

  • how many letters have we already processed
  • how many “long” and how many “short” equivalence classes do we still need to cover

In order to process the next letter, we look at the number of occurrences it has in our collection, we iterate over all ways to split those occurrences so that they exactly fill some number of long and some number of short classes, and each of those options (if any exist) leads us to a new state with one more letter processed.

Once the DP is finished, we know whether a solution exists – whether we reached the state where all letters have been processed and all equivalence classes were covered. If we did, we can now construct a valid solution in the standard way: if we store (or recompute on demand) pointers saying which state was first reached from which earlier state, we can now go backwards from the fully solved state and reconstruct a sequence of steps that got us there.
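
To make the check concrete, here is a minimal sketch of the feasibility test for one fixed P (our own illustration, not the author's code; it answers only yes/no, and the reconstruction described above is omitted). We pass in q = floor(N/P), longs = N mod P and shorts = P - (N mod P); note that q >= 1 whenever P <= N:

// cnt[letter] = number of occurrences of each letter in our collection
boolean feasible(int[] cnt, int q, int longs, int shorts) {
    // reach[l][s] = true if, after processing a prefix of the alphabet,
    // we can be left with exactly l long and s short classes to fill
    boolean[][] reach = new boolean[longs+1][shorts+1];
    reach[longs][shorts] = true;
    for (int letter = 0; letter < 26; ++letter) {
        boolean[][] next = new boolean[longs+1][shorts+1];
        for (int l = 0; l <= longs; ++l) for (int s = 0; s <= shorts; ++s) {
            if (!reach[l][s]) continue;
            // split this letter's occurrences into a long and b short classes
            for (int a = 0; a <= l && a*(q+1) <= cnt[letter]; ++a) {
                int rest = cnt[letter] - a*(q+1);
                if (rest % q != 0) continue;
                int b = rest / q;
                if (b <= s) next[l-a][s-b] = true;
            }
        }
        reach = next;
    }
    // a valid split exists iff we can end with no classes left to fill
    return reach[0][0];
}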

Time complexity estimate: We’ll start with an asymptotic upper bound.

The DP has (26+1)*(long+1)*(short+1) states. For a fixed P we have long+short=P, and thus (long+1)*(short+1) is clearly O(P^2). 

When processing a single letter with X occurrences, we can iterate over all possible numbers of long equivalence classes. This will take O( X/(N/P) ) steps. Summing this over all letters gives us a total of O( N/(N/P) ) = O(P) steps. Thus, the total time spent on evaluating all states of the form (*,l,s) for any pair (l,s) is O(P). Hence, a specific value of P can be checked in O(P^3) time, which means that the full solution that tries P=1,2,…,N is guaranteed to run in O(N^4) time.

N^4 may seem like a bit too much, but the solution is fast enough because the asymptotic estimate gives only a very loose upper bound. The main reasons why it’s loose:

  • The number of long and short equivalence classes is almost never split down the middle, so the actual number of states is usually lower.
  • If we consider only letters with a non-zero number of occurrences, there are actually only at most (long+1)*(short+1) reachable states total, without the 27 factor. After we processed some letters, we know the total number of characters we still have left to process and only the states with that exact total number of missing letters are reachable.
  • Once we get to P >= N/2 + 13, we are guaranteed to find a solution. At that point all equivalence classes have at most two elements, and we have at least 26 one-element classes so we can be sure that all the letters we have can be distributed among the equivalence classes, regardless of count parities.

We can now, for any fixed N, simply iterate over all viable P, each time checking the actual numbers of long and short equivalence classes, and thus the number of DP states we get. N=777 is indeed the worst case, and in this case we get a total of fewer than 3,300,000 DP states (summed over all viable P). This is still just an upper bound, as many of those still won't be reachable.

An additional significant speedup (not necessary for the solution to run in time, but it can improve the running time considerably): once you fix P, do one brute-force pass for each letter to find whether all its occurrences can be split into the current long and short equivalence classes, and if yes, store all the ways to do so, so that we later iterate only over those when processing each state. This immediately eliminates many values of P, as some letter will have an incompatible number of occurrences, and it significantly speeds up the processing of the remaining values of P.

WeighCoins

Let’s start with the old weighings. In each of them we were comparing two batches consisting of the same number of coins. These “sensible weighings” are also the only type of weighings we should be making in the future. If you weigh more coins on one scale against fewer on the other and the answer is “more is heavier”, you have learned nothing at all about the coins: the counterfeit coin can still be any of the N coins, and it can still be either slightly heavier or slightly lighter than the regular ones; in all those 2*N settings the outcome of the weighing is exactly the same. A worst-case-optimal solution never asks a question that has a branch in which it learns nothing new.

A sensible weighing has two possible kinds of outcomes: it either reports equality, or it discovers an inequality.

Equality is easy to process: we simply learned that all coins involved in the comparison are real.

If all the previously made weighings resulted in equality, the starting state of our search is that we know that the counterfeit coin is among those we never touched, and that it can be both heavier and lighter than a real one. These states will be called “type 1 states”. The only parameter of a type 1 state is the number of coins that are still candidates for the counterfeit one.

How do we process a weighing that reported an inequality? Without loss of generality, suppose the left scale is lighter than the right one. Then:

  • All coins other than those on the scales are real, and
  • either one of the coins on the left scale is a counterfeit lighter than a real coin, or
  • one of the coins on the right scale is a counterfeit heavier than a real coin.

As soon as we encounter our first inequality, we are in a “type 2 state”. In such a state we no longer have any coins that could be both counterfeit-light and counterfeit-heavy. Each coin is either known to be real, or it can be light-but-not-heavy, or it can be heavy-but-not-light. Each type 2 state therefore has two parameters: the number of light candidates and the number of heavy candidates left.

(So far we haven’t stated this explicitly, so we insert this information here: by this point the reader was expected to realize that the optimal solution to any state does not depend on which exact coins have which type, only on the total numbers of coins of each type. E.g., the state where we know that one of the coins A,B,C is a lighter-than-real counterfeit has clearly the same solution as the state where we know this about the coins T,C,O instead. Coin labels don’t matter, both situations are the same state: “we have three ‘light candidates’ and no ‘heavy candidates’ among our coins”.)

We can now process the input using the above rules. While we do so, we keep two flags for each coin: whether it can still be light and whether it can be heavy. Once we finish going through the given weighings, we then check whether we are in a type 1 or a type 2 state, and we count the corresponding parameter(s).
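
As a concrete illustration, the flag updates could look as follows (the helper and the encoding of a weighing’s result are our assumptions, not the problem’s actual input format):

// canBeLight[i] / canBeHeavy[i] start out true for every coin;
// left[] and right[] hold the indices of the coins on the two scales
void processWeighing(int[] left, int[] right, int result,
                     boolean[] canBeLight, boolean[] canBeHeavy) {
    if (result == 0) {
        // equality: every coin that was on either scale is real
        for (int i : left)  { canBeLight[i] = false; canBeHeavy[i] = false; }
        for (int i : right) { canBeLight[i] = false; canBeHeavy[i] = false; }
    } else {
        // inequality: every coin that was NOT on the scales is real
        boolean[] onScale = new boolean[canBeLight.length];
        for (int i : left)  onScale[i] = true;
        for (int i : right) onScale[i] = true;
        for (int i = 0; i < onScale.length; ++i)
            if (!onScale[i]) { canBeLight[i] = false; canBeHeavy[i] = false; }
        // say result < 0 means the left scale was lighter
        int[] lighter = (result < 0 ? left : right);
        int[] heavier = (result < 0 ? right : left);
        for (int i : lighter) canBeHeavy[i] = false; // can only be light
        for (int i : heavier) canBeLight[i] = false; // can only be heavy
    }
}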

Now we are going to use dynamic programming to find the optimal solution for all possible states. As we already noted, once we are in a type 2 state we remain in a type 2 state, while from a type 1 state we can either reach another type 1 state, or we can reach a type 2 state. Thus, we will first solve all type 2 states and then all type 1 states.

In order to solve a specific state, we can simply look at all reasonable ways of making the next weighing. Each of them will give us some new states (that are closer towards finding the answer), among those we pick the worst one (we are optimizing the worst case of our algorithm), and among the possible next weighings we then pick that one that has the best possible worst outcome.

What are the reasonable ways to make a weighing? Here’s one simple optimization that saves an order of complexity: it does not make sense to put known real coins onto both scales – removing one from each scale does not change the outcome. Thus, in a reasonable weighing one scale contains only candidates (some light, some heavy) while the other may contain some candidates and some real coins. We can now iterate over all these possibilities: we pick the number of coins on each scale, the number of light candidates on the first scale (the rest of that scale are heavy candidates), and the numbers of light and heavy candidates on the second scale (the rest of it are real coins). For each weighing we consider each of the three possible outcomes, for each of those we look up its worst-case optimal solution, and we take the maximum of those three values (again, worst case) plus one (the weighing we just did) as a candidate answer for the current state.

Be careful: usually the new states are closer towards being solved, but it’s also possible for a new state to be the same as the current state (e.g., if we repeat a weighing we already did, we learn nothing new), so make sure to handle this in your code correctly.

Solving a type 1 state is essentially the same but simpler: we just iterate over all options for the number of coins weighed and the number of known-real coins we should put on the right scale along with the candidate ones.

With C coins there are O(C) type 1 states and O(C^2) type 2 states. For each type 2 state we try O(C^4) ways to make the next weighing, so our solution has a theoretical upper bound of O(C^6). There is an obvious worst case (starting with the most coins and knowing nothing), so it’s easy to verify that the actual constants are small enough for this solution to run in time.

For some more precise math: the first time we make a weighing that discovers an inequality, we are left with at most N/2 <= 26 candidates of each type. Even if all those type 2 states were reachable, we would spend at most 26^4 steps on each of them, giving us a loose upper bound of only about 300M steps. The bound is loose because most of the states are smaller, and the amount of work needed to solve a state drops off very quickly as its size decreases. Actually running the empty nested loops to count all viable pairs (type 2 state, potential weighing for that state) would tell us that the exact count is much smaller: fewer than 5M such pairs exist.

Code to process all type 2 states follows. (The array memo2, declared elsewhere, stores for each type 2 state the optimal worst-case number of remaining weighings; memo2[l][h] corresponds to l light and h heavy candidates.)

// for each sum from 0 to N, process all states with exactly sum candidates
// out of those, LL are potentially-light and HH potentially-heavy
for (int sum=0; sum<=N; ++sum) for (int LL=0; LL<=sum; ++LL) {
    int HH = sum-LL;
    if (LL+HH <= 1) { memo2[LL][HH] = 0; continue; }

    // set the answer to this case to infinity
    // (note that we may refer back to this value during its computation,
    // but whenever we do so, we won’t improve it)
    memo2[LL][HH] = 987654321;

    // what’s the maximum number of coins we consider putting on each scale
    int maxOnEach = Math.min( N/2, LL+HH );

    // what’s the maximum number of known-real coins available
    int maxReal = N-LL-HH;

    // iterate over all #s of coins on one scale
    for (int use=1; use<=maxOnEach; ++use) {
        // for the left scale, iterate over all #s of light candidates
        // and use heavy candidates for the rest
        for (int l1=0; l1<=LL; ++l1) {
            int h1 = use-l1;
            if (h1 < 0 || h1 > HH) continue;

            // for the right scale, iterate over all #s of light and heavy
            // candidates and use real coins for the rest
            for (int l2=0; l1+l2<=LL; ++l2) for (int h2=0; h1+h2<=HH; ++h2) {
                int real = use - l2 - h2;
                if (real < 0) continue;
                if (real > maxReal) continue;

                // look at the three possible outcomes of the current
                // weighing and pick the worst option plus one
                int curr = 0;
                curr = Math.max( curr, memo2[LL-l1-l2][HH-h1-h2] );
                curr = Math.max( curr, memo2[l2][h1] );
                curr = Math.max( curr, memo2[l1][h2] );
                memo2[LL][HH] = Math.min( memo2[LL][HH], 1+curr );
            }
        }
    }
}
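
For completeness, here is a similar hypothetical sketch for the type 1 states (memo1 and its base case are our assumptions; this loop must run after the one above because it reads memo2, and like memo2 it treats “only one candidate left” as solved):

// memo1[u] = optimal worst-case number of weighings when u coins can still
// be either kind of counterfeit and the remaining N-u coins are known real
for (int u = 0; u <= N; ++u) {
    if (u <= 1) { memo1[u] = 0; continue; }
    memo1[u] = 987654321;
    // k coins on each scale: the left scale holds k candidates, the right
    // scale holds c2 candidates plus k-c2 known-real coins
    for (int k = 1; 2*k <= N; ++k) {
        for (int c2 = 0; c2 <= k; ++c2) {
            if (k + c2 > u) continue;        // not enough candidates
            if (k - c2 > N - u) continue;    // not enough known-real coins
            int curr = memo1[u - k - c2];               // scales balance
            curr = Math.max(curr, memo2[k][c2]);        // left scale lighter
            curr = Math.max(curr, memo2[c2][k]);        // left scale heavier
            memo1[u] = Math.min(memo1[u], 1 + curr);
        }
    }
}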

misof

