Match Editorial

SRM 416
Thursday, September 4, 2008

Match summary

Division 1 saw a hard problem that seemed more approachable than usually. After all, everything was given in the problem statement, the contestants "only" had to implement it. However, the problem turned out to be worth each and every of those 1000 points, and in the end only two coders (Petr and ACRush) were able to submit flawless solutions – and claimed the top two spots, in this order. The third place went to Smylic, thanks to a solid time on the 500 and two challenges.

In Division 2, more than twenty coders successfully solved the hard problem, and thus the battle for top spots came down to solving time and challenges. In the end, the top three were snomak, iliyafilippov, and mvolke. The best newcomer, ghostgold, placed thirteenth.

The Problems

MostCommonLetters

Used as: Division Two - Level One:

Value 250

Submission Rate 612 / 675 (90.67%)

Success Rate 534 / 612 (87.25%)

High Score snomak for 248.31 points (2 mins 21 secs)

Average Score 200.51 (for 534 correct submissions)

Even with tasks like this one, it usually pays off to be systematic and break the solution into smaller steps. In our solution, we start by computing the number of occurrences for each of the 26 letters. Once we have these 26 values, we can easily compute their maximum. Now, we can process the 26 letters in alphabetical order, and each time we encounter one with a number of occurrences equal to the maximum, we add it at the end of the result.

  string listMostCommon(vector <string> text) {
    // count the occurrences of each letter
    vector<int> occurrence_count(26,0);
    for (unsigned i=0; i<text.size(); i++)
      for (unsigned j=0; j<text[i].size(); j++)
        if (isalpha(text[i][j]))
          occurrence_count[ text[i][j]-'a' ]++;

    // find the maximum count
    int maximum_count = -1;
    for (int i=0; i<26; i++)
      maximum_count = max(maximum_count, occurrence_count[i]);

    // process all letters in order and list the most frequent ones
    string res = "";
    for (int i=0; i<26; i++)
      if (occurrence_count[i]==maximum_count)
        res += char('a'+i);
    return res;
  }

NextNumber

Used as: Division Two - Level Two:

Value 500

Submission Rate 513 / 675 (76.00%)

Success Rate 139 / 513 (27.10%)

High Score togi_ for 490.74 points (3 mins 55 secs)

Average Score 297.53 (for 139 correct submissions)

Used as: Division One - Level One:

Value 250

Submission Rate 572 / 581 (98.45%)

Success Rate 452 / 572 (79.02%)

High Score Knio for 249.07 points (1 mins 44 secs)

Average Score 198.37 (for 452 correct submissions)

Let S(X) be the binary representation of the number X, that is, a string of ones and zeroes. For example, S(18)=10010 and S(24)=11000.

If we take two numbers A<B with an equal number of binary digits, it can easily be seen that S(A)<S(B) – the most significant binary digit where A and B differ corresponds to the leftmost character where S(A) and S(B) differ.

Except for one special case, the problem statement can now be rephrased as follows: Take the string S(N). Out of all permutations of this string (these are the binary representations of other numbers with the same weight), find the one that is next in the lexicographic order.

The special case we mentioned above is the case where S(N) is of the form 11...100...0. In this case, the next number with an equal number of ones will have one more decimal digit.

An elegant way to get rid of this special case is to always start the string S(X) with a leading zero. For example, S(18) will be 010010 and S(24) will be 011000. Now the problem statement simply becomes: Find the next permutation of the given string of zeroes and ones.

In C++, we can use the STL function next_permutation to do all of our work.

  int getNextNumber(int N) {
    // break N into bits
    vector<int> bits(32,0);
    for (int i=0; i<32; i++)
      if (N & 1<<i)
        bits[31-i] = 1;

    // find the next permutation of this many 0s and 1s
    next_permutation( bits.begin(), bits.end() );

    // convert the bits back into a number
    int res = 0;
    for (int i=0; i<32; i++)
      if (bits[i])
        res += 1<<(31-i);
    return res;
  }

Alternately, we could do all the work ourselves. The general next_permutation algorithm is described, for example, in previous editorials. However, in the case of binary strings the algorithm can be simplified as follows:

Find the first 0 from the right that is followed by a 1. Clearly, in the next number, this position already has to contain a 1, and the positions to the left of it will remain unchanged.
Place a 1 at the position found.
Fill in the remaining positions from the left to the right, first using all the remaining zeroes, then all the remaining ones. (This produces the smallest such number.)

For example, consider the string 01011000. This process will look as follows:

step 1:   the position we seek is this one: 01*.....
step 2:     this position must contain a 1: 011.....
step 3a: fill in the four remaining zeroes: 0110000.
step 3b:     fill in the one remaining one: 01100001

This approach is implemented in many of the top contestants' solutions: first, second, and can even be reduced to a few bitwise operations.

DancingCouples

Used as: Division Two - Level Three:

Value 1000

Submission Rate 119 / 675 (17.63%)

Success Rate 24 / 119 (20.17%)

High Score iliyafilippov for 890.91 points (10 mins 11 secs)

Average Score 559.91 (for 24 correct submissions)

Probably the most important aspect of this task was to realize how big the output value can be. It is tempting to deduce that the worst case is when we want to make 10 pairs of boys and girls that like each other – where the answer is 10! = 3,628,800. However, this is not true. The worst case is when we want 8 pairs. We can select the 8 boys in choose(10,8) = 45 ways, and assign them the girls in 10*9*8*7*6*5*4*3 = 1,814,400 ways, for a total of 81,648,000 ways how to assign the dancers.

Actually, this number is still reasonably low, and I believe that a heavily optimized program with time complexity linear in the answer could be able to finish within the time limit. However, there was a better way, and all the top finishers in division 2 realized this. The brute force search that considers all possible pairings will often be solving the same subproblems. For example, regardless of whether we have pairs Alice-Bob and Eve-Oscar or pairs Alice-Oscar and Eve-Bob, the number of ways how to make other three couples is in both cases the same.

Each subproblem is clearly defined by three variables: the set of unmatched boys, the set of unmatched girls, and the number of couples we need to make. This would give us 2¹⁰ * 2¹⁰ * 10 = approximately 10 million states.

There is still one more easy improvement: we can process the boys in order. In that case, there will only be at most 11 possible sets of unmatched boys – for each K, the set that contains the first K boys. In this way, we reduced the number of states to 11 * 2¹⁰ * 10 = approximately one hundred thousand, which is low enough for our solution to be reasonably fast.

In the program below, we represent the set of available girls as an integer from 0 to 1023, where bit X is set iff girl X is still available.

  int memo[12][1030][12];
  vector<string> canDance;

  int solve(int boys, int available_girls, int couples) {
    // handle the trivial cases
    int girls = __builtin_popcount(available_girls);
    if (couples>boys || couples>girls) return 0;
    if (couples==0) return 1;

    // if we already solved this case, return the memoized answer
    if (memo[boys][available_girls][couples] >= 0)
      return memo[boys][available_girls][couples];

    // in the general case, first try not to match the last boy...
    int result = solve(boys-1, available_girls, couples);

    // ... and then try all valid pairs for the last boy
    for (int girl=0; girl<canDance[boys-1].size(); girl++)
      if (canDance[boys-1][girl]=='Y')
        if (available_girls & 1<<girl)
          result += solve(boys-1, available_girls ^ (1<<girl), couples-1);

    // store and return the result
    memo[boys][available_girls][couples] = result;
    return result;
  }

  int countPairs(vector <string> cD, int K) {
    // initialize the global variables
    canDance = cD;
    memset(memo,-1,sizeof(memo));

    // recursively solve the task
    return solve(canDance.size(), (1<<canDance[0].size())-1, K);
  }

CustomDice

Used as: Division One - Level Two:

Value 500

Submission Rate 147 / 581 (25.30%)

Success Rate 119 / 147 (80.95%)

High Score ainu7 for 443.76 points (10 mins 23 secs)

Average Score 270.84 (for 119 correct submissions)

First, note that the problem statement contains the following information: For any set of 6 distinct integers, there are exactly 30 ways how to place them on a dice. Thus it is enough to count the sets of integers that satisfy the requirement, and at the end multiply the result by 30.

Second, the condition "the average must not exceed M" can be rephrased as "the sum must not exceed 6M". In this way, we got rid of any non-integer operations. From this point on, all the numbers in this solution will be integers.

We now want to count the number of sets {x₁, ..., x₆} such that:
0 < x₁ < ... < x₆
x₁ + ... + x₆ ≤ 6M

The dynamic programming solution that will be presented below can directly be applied at this point. However, I prefer to simplify the problem first.

By substituting y_i=x_i-i we get an equivalent problem: Count the number of sequences (y₁, ..., y₆) such that:
0 ≤ y₁ ≤ ... ≤ y₆
y₁ + ... + y₆ ≤ 6M-21

At this point it is obvious that for M≤3 there are no solutions. Now, we can get rid of the condition y_i≤y_i+1 as well.

Let:
z₁=y₆-y₅,
z₂=y₅-y₄,
...
z₅=y₂-y₁,
z₆=y₁

In ASCII graphics, this substitution can be visualised as shown below. Note that sum(y_i) = sum(i × z_i).

y1: +---------+
y2: +------------+
y3: +------------------+
y4: +----------------------+
y5: +----------------------------+
y6: +---------------------------------+

    +---------+--+-----+---+-----+----+
        z6     z5   z4   z3   z2   z1

We get another equivalent problem: Count the number of sequences (z₁, ..., z₆) such that:
0 ≤ z₁, ..., z₆
z₁ + 2z₂ + ... + 6z₆ ≤ 6M-21

At this point, there is no obvious way how to simplify the formula even further. (However, in many similar tasks there might be one, which is why we present this approach.)

We can now count the number of valid sequences as follows. There are two types of valid sequences: Those where z₆>0 and those where z₆=0. In the first case, we can subtract 1 from z₆, and get the same problem with the right side of the inequality smaller by 6. In the second case, we get a similar problem with only 5 variables.

This can be formulated as a recurrence relation. Let cnt[v][s] be the number of ways in which we can set variables z₁ to z_v so that the total (z₁ + ... + vz_v) does not exceed s. Then we have: cnt[v][s] = cnt[v-1][s] + cnt[v][s-v].

After adding proper boundary conditions (and limiting the program to only use two rows of the dynamic programming table to save memory) we get the code below:

  int MOD = 1000000007;
  int cnt[2][6000047];

  int countDice(int M) {
    M = 6*M-21;
    if (M<0) return 0;
    memset(cnt,0,sizeof(cnt));
    for (int s=0; s<=M; s++) cnt[0][s]=1;
    for (int v=1; v<=6; v++)
      for (int s=0; s<=M; s++) {
        cnt[v%2][s] = cnt[1-v%2][s];
        if (s >= v) cnt[v%2][s] += cnt[v%2][s-v];
        cnt[v%2][s] %= MOD;
      }
    return (30LL*cnt[0][M])%MOD;
  }

RussianCheckers

Used as: Division One - Level Three:

Value 1000

Submission Rate 10 / 581 (1.72%)

Success Rate 2 / 10 (20.00%)

High Score Petr for 640.18 points (24 mins 24 secs)

Average Score 556.71 (for 2 correct submissions)

Everything we need is spelled out in the problem statement. Clearly, the number of possible moves will always be small, and thus we can use backtracking to generate the list of all possible moves. The difficulty in this problem lies in finding a way how to implement all the rules easily, with as little effort as possible, and at the same time to make sure that we didn't forget any of the rules.

Some important rules and observations that were easy to overlook:

When the "king" moves zero or more additional squares after jumping over the opponent, it must stop at a square from which it can perform another capture if possible.
If a "man" reaches the far side of the board as the result of a capturing move, it becomes a "king" and must continue capturing if possible, but as a "king".
We can not remove captured pieces at the moment when we capture them, we shall only mark them as captured. Otherwise we will get illegal moves, e.g., a king on c3 with opponents at b2 and d4 is not allowed to jump c3:a1:e5.
We can visit the same cell twice during a move: a man at g1 with opponents at f2, d2, b2, b4, d4, and f4 is allowed to jump g1:e3:c1:a3:c5:e3:g5, visiting e3 twice. Thus we need to mark opponent's pieces as captured, marking visited cells does not work.
The first capturing move of a man is allowed to be backwards. (In this aspect the problem statement differs from checkers rules I'm used to, in the contest this is where I would most probably make a mistake.)

Suggestions how to implement things nicely:

Obviously, we can process our pieces one at a time.
Have separate functions for generating simple moves and capturing moves. Only if the second one returns nothing for each of our pieces, start calling the first one.
Have a global array stating which pieces are captured at the moment.
Write a single recursive function to look for capturing moves, with a boolean flag whether the piece is a king as one of the arguments. Note that this flag may change during the search.

Petr's solution is as readable as it gets.

By misof
TopCoder Member