TCO21 Algorithm Round 1A Editorials
250: EllysBalancedStrings
The problem EllysBalancedStrings presented a fairly easy task to start the TopCoder Open 2021 Algorithm track. One had to make the observation that we should just count the number of vowels and consonants, see which are more, and change them into the opposite type (convert vowels to consonants if the number of vowels is greater than the number of consonants, and consonants to vowels otherwise). It should be fairly obvious that it doesn't make sense to convert a vowel to a consonant if there are more consonants, and vice-versa – a consonant to a vowel, if there are more vowels.
Now the question is which of the letters to increment/decrement? Since we have no requirement what the resulting string must be, we should approach this greedily – pick the letters that take fewest operations to convert.
All vowels can be converted to consonants in a single operation – actually, a single increment: 'A' into 'B', 'E' into 'F', 'I' into 'J', 'O', into 'P', 'U' into 'V'. Thus, whenever we get an input with more vowels than consonants, the answer is always the difference between the number of vowels and the number of consonants, divided by two.
Consonants are slightly trickier, as some of them require more than one operation (for example, 'C' requires two operations (either decrements, to get an 'A', or increments, to get a 'E'). Some consonants require more operations than others – for example 'X' requires three decrements to get to 'U', and 'Z' requires 5. Thus, what we do is calculate for each of the letters we have, how many operations we need to change it into the opposite type, then sort them by operations, and use the letters that require the fewest operations.
For example, let's look at the example "SYZYGY". We have six consonants, thus have to convert three of them into vowels. 'S' requires 2 operations (incrementing it to 'U'). 'Y' requires 4 operations (decrementing it to 'U'), 'Z' requires 5 (again, decrementing to 'U'), and 'G' requires 2 (either decrementing to 'E' or incrementing to 'I'). Having those sorted, we get the array [2, 2, 4, 4, 4, 5]. We take the three letters requiring the least number of operations, thus 2 + 2 + 4 = 8.
We need to take care to not use vowels, but other than that, there is not much to get wrong here. We also need to either write a function, which gives us the minimal number of operations to get a letter of the opposite type, or just enter this array manually (as there are only 26 letters and it's quite simple to do this by hand).
The following code shows one way to implement the described solution:
private int[] changes = { 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 2, 3, 2,
1, 1, 1, 2, 3, 2, 1, 1, 1, 2, 3, 4, 5 };
private boolean isVowel(char ch) {
return ch == 'A' || ch == 'E' || ch == 'I' || ch == 'O' || ch == 'U';
}
public int getMin(String S) {
List<Integer vow = new ArrayList<
();
List<Integer con = new ArrayList<
();
for (int i = 0; i < S.length(); i++) {
if (isVowel(S.charAt(i))) {
vow.add(changes[S.charAt(i) - 'A']);
} else {
con.add(changes[S.charAt(i) - 'A']);
}
}
vow.sort(Comparator.naturalOrder());
con.sort(Comparator.naturalOrder());
int ans = 0;
int idxVow = 0, idxCon = 0;
while (con.size() + idxVow < vow.size() - idxVow)
ans += vow.get(idxVow++);
while (vow.size() + idxCon < con.size() - idxCon)
ans += con.get(idxCon++);
return ans;
}
The complexity of this problem is O(|S| * log(|S|)), where |S| is the length of the input S. We could, in theory, get it to O(|S|) if using counting sort; however, with such low constraints that was an overkill.
500: EllysTwinLetters
The second problem of the set presented a different kind of challenge – one, that more experienced coders would immediately recognize as Dynamic Programming.
In this case, the DP was quite easy; however, if you've never faced one it may still be a hard problem for you. Take a look at the TopCoder tutorials on the subject: https://www.topcoder.com/community/competitive-programming/tutorials/dynamic-programming-from-novice-to-advanced/
So, what we needed to do? One intuitive (but wrong) way to approach the problem was to greedily select letters. On example greedy solution would be to change all letters without a twin to be the same as one of their neighbors (either the left one, or to the right one). From them, choose the one with fewer operations needed.
This greedy, however (as most greedy solutions for this problem, actually) doesn't catch the case when we change a single letter to "fix" two twin letters at the same time. For example, let's have the string "AADIDAA". The 'A's already have a twin, thus we don't need to change them. The 'D' is closer to the 'A' (3 operations) than to the 'I' (4 operations), thus we would make it an A. Then We'd make the I be a 'D' (for another 4 operations) and we'd have the string "AAADDAA", which is of the correct type. However, we can change the 'I' in the middle to be a 'D' with 4 operations, and then the two 'D's will have a twin and wouldn't need to change them. Thus, we found a "cheaper" answer.
So, as mentioned above, what we need to do is a solution, based on a dynamic programming approach. Our state would have a single dimension (easiest possible DP). This dimension would tell us what is our current index of the string (we'll assume all letters in the previous position have a twin and we won't care about them at all).
So, being at index idx, we need to make sure it has a twin, thus we should fix what character it will be and then change the next letter to be the same. Actually, as we mentioned above, we can have a sequence of more than two "twins", thus after fixing what letter the character at the current index would be, we should also decide how many letters after it will be the same. Since we don't know what is the optimal letter, and we don't know how many after it should be the same, we just try all options! That's the power of the DP, it allows us to try everything, without costing us high runtime.
So, the recursive function for the dynamic programming could look something like that:
private int recurse(int idx) {
if (idx = S.length())
return 0;
if (dyn[idx] != -1)
return dyn[idx];
int ans = S.length() * 26;
for (char target = 'A'; target <= 'Z'; target++) {
int curDist = Math.abs(S.charAt(idx) - target);
for (int end = idx + 1; end < S.length(); end++) {
curDist += Math.abs(S.charAt(end) - target);
ans = Math.min(ans, curDist + recurse(end + 1));
}
}
return dyn[idx] = ans;
}
Here, dyn[] is our "dynamic table" – an array with already calculated answers. It is initialized with -1 in the beginning (since that's an invalid answer and we can use it as a marker whether a state is calculated or not). Whenever a state is not yet calculated, we have a loop to fix what letter we'll have at the current position, and another loop how many letters after it will be the same. Since we try every possible option, we are guaranteed that one of them would be the optimal one – thus, we'll find the best possible answer.
The complexity of this approach is O(N^2 * 26) – we have O(N) states, and O(N * 26) complexity for each state. This problem can actually be solved faster (using two-dimensional dynamic table) – can you figure out how? Hint – the state would be dyn[N][26] (thus, the state would be O(N * 26)) and the complexity for each state would be O(26).
Note that I'm abusing the Big-O notation here, as there is no such thing as O(26) – that's O(1) – but as N is quite small, I'm doing it to hint which operations contribute to the runtime of our solution.
1000: EllysRansom
The third problem of the set was much harder than the previous two. To be fair, the set-up looks quite "standard", so I was wondering if some similar problem was given before, but couldn't find one. Well, with tens, if not hundreds of thousands of programming problems given, I guess it's harder and harder for problem setters to prevent that :shrug:
Okay, let's examine the problem. We need some letters (given in the string T) and we have some options how to get them (A[i] or B[i]). Whenever we have A[i] == B[i] it's obvious that it doesn't matter which one we choose. Thus, we handle all those cases first.
Sometimes we also have both A[i] and B[i] containing a character we don't need (it is either not present in T at all, or we've already got all the letters we need from the previous case.
If only one of A[i] or B[i] is needed (and the other one is not), it's also obvious that we can take it.
So, for the rest of the problem we'll consider that all (remaining) positions have A[i] != B[i] and both A[i] and B[i] are needed.
Here greedy also looks promising (and, in contrast to the previous problem, is closer to actually achieving optimal answer). It is still wrong, though (and we've tried creating tests that break greedy solutions).
So what we must do is actually use a network flow (https://www.topcoder.com/thrive/articles/Maximum%20Flow:%20Part%20One). The network we'll create will have four "layers":
The source (a single node);
The positions in A/B (N nodes);
The letters from the English alphabet (26 nodes);
The sink (a single node).
The source will have an infinite supply (as it is often the case in network flow problems). It will have an edge to each node in the second layer (the positions in A/B) with capacity 1 (we can, after all, use each position at most once). Each position will, in turn, have an edge to two of the letters in the third layer (the letters A[i] and B[i], respectively). The nodes from the third layer (the letters) will each have an edge to the sink, with capacity equal to the number of times the respective letter is contained in T (that's how many times we want to "cut it").
Running a flow in this network, we'll either get flow less than |T| (in which case the answer is "NO SOLUTION"), or, if the flow will be exactly |T|. We should see how the algorithm we've chosen has selected the letters and reconstruct them. In general, if we have a flow from the first layer to the second, then we have used the respective position; otherwise we have not, thus we'll have a '_' in the resulting string. Then, checking the flow from the second layer to the third, we can see which letter we have used on each position.
Using one relatively simple implementation (as the constraints were low enough to not require any of the "fancy" ones), the code can look as follows:
private final int MAX = 1033;
List<Integer[] v;
private boolean[] vis;
private int[][] capacity;
private void addEdge(int from, int to, int cap) {
v[from].add(to);
v[to].add(from);
capacity[from][to] += cap;
}
private boolean recurse(int node, int target) {
vis[node] = true;
if (node == target)
return true;
for (int i = 0; i < v[node].size(); i++) {
if (!vis[v[node].get(i)] && capacity[node][v[node].get(i)] 0) {
if (recurse(v[node].get(i), target)) {
capacity[node][v[node].get(i)]--;
capacity[v[node].get(i)][node]++;
return true;
}
}
}
return false;
}
private int findFlow(int source, int sink) {
int flow = 0;
while (true) {
vis = new boolean[MAX];
if (!recurse(source, sink))
break;
flow++;
}
return flow;
}
public String getRansom(String A, String B, String T) {
vis = new boolean[MAX];
capacity = new int[MAX][MAX];
v = new List[MAX];
for (int i = 0; i < MAX; i++)
v[i] = new ArrayList<();
int source = A.length() + 26, sink = A.length() + 27;
for (int i = 0; i < A.length(); i++) {
addEdge(source, i, 1);
addEdge(i, A.length() + A.charAt(i) - 'A', 1);
addEdge(i, A.length() + B.charAt(i) - 'A', 1);
}
for (int i = 0; i < 26; i++) {
int need = 0;
for (int c = 0; c < T.length(); c++)
if (T.charAt(c) == 'A' + i) need++;
addEdge(A.length() + i, sink, need);
}
if (findFlow(source, sink) != T.length())
return "NO SOLUTION";
StringBuilder ans = new StringBuilder();
for (int i = 0; i < A.length(); i++) {
ans.append(capacity[source][i] 0 ? '_' :
capacity[i][A.length() + A.charAt(i) - 'A'] 0 ? B.charAt(i) : A.charAt(i));
}
return ans.toString();
}
The complexity of the given solution is O(|T| * |A|), as we have a maximum flow of |T| and each augmenting path visits at most O(|A|) nodes/edges.