|
Match Editorial |
SRM 155Thursday, July 17, 2003
Match summary
Incan Investigators Find Fraud by Deadbeat Dads in Twisty Two-Toned Trees...SnapDragon Wins Again!
SnapDragon continues his dominance, winning for the sixth
time out of the past eight challenges. Yarin was hot on his heels
until a late resubmit to fix a typo. ZorbaTHut used one of
only a handful of challenges in Division 1 to pull into second place.
In Division 2, TheFaxman finished all three problems in a
lightning quick 24 minutes, only to watch ambclams pull ahead
for the victory two minutes later.
SRM 155 marks the debut of VB as a TopCoder programming language.
A scoring bug briefly showed one VB.NET user's total score as a heart-stopping
-8000, but otherwise, the debut went smoothly. Welcome aboard VB coders!
The Problems
Quipu
Used as: Division Two - Level One:
Value
|
250
|
Submission Rate
|
197 / 227 (86.78%)
|
Success Rate
|
182 / 197 (92.39%)
|
High Score
|
kats for 248.05 points (2 mins 31 secs)
|
Average Score
|
190.82 (for 182 correct submissions)
|
In this problem, you had to determine the size of each cluster of knots by
counting consecutive 'X's, allowing for the possibility of empty clusters.
You could do this with two nested loops, with the inner loop
processing a single cluster and the outer loop walking from cluster
to cluster:
int number = 0;
for (int i = 0; i < knots.length(); i++) {
int cluster = 0;
while (knots.charAt(i) == 'X') {
cluster++;
i++;
}
number = number*10 + cluster;
}
Alternatively, you could collapse these two loops into a single loop by
incrementing every time you see an '
X' and multiplying by 10
every time you see a '
-' (except for the trailing
'
-'):
int number = 0;
for (int i = 0; i < knots.length()-1; i++) {
if (knots.charAt(i) == 'X') number++;
else number *= 10; // '-' case
}
In either version, you could also ignore the leading '
-' since
multiplying 0 by 10 has no effect.
BenfordsLaw
Used as: Division Two - Level Two:
Value
|
500
|
Submission Rate
|
130 / 227 (57.27%)
|
Success Rate
|
105 / 130 (80.77%)
|
High Score
|
TheFaxman for 446.93 points (10 mins 2 secs)
|
Average Score
|
277.27 (for 105 correct submissions)
|
According to Benford's Law, the probability that a number starts with
digit D is log10(1+1/D). Therefore, given a
sample of N numbers, you would expect E =
N*log10(1+1/D) of them to start with D.
A digit is questionable if its actual frequency is less than
E/threshold or greater than E*threshold.
You count how many times each leading digit actually occurs and compare
it to these border values.
Many people lost time by accidentally using natural logarithms
(base e) rather than base-10 logarithms. To convert from
one base to the other, you use the identity log10(x) =
ln(x) / ln(10). Another easy mistake to make was to accidentally
use integer division in calculating 1+1/D, instead of converting
to doubles.
PaternityTest
Used as: Division Two - Level Three:
Value
|
1000
|
Submission Rate
|
70 / 227 (30.84%)
|
Success Rate
|
52 / 70 (74.29%)
|
High Score
|
ambclams for 953.86 points (6 mins 18 secs)
|
Average Score
|
691.68 (for 52 correct submissions)
|
Used as: Division One - Level One:
Value
|
300
|
Submission Rate
|
123 / 136 (90.44%)
|
Success Rate
|
117 / 123 (95.12%)
|
High Score
|
shadowless for 295.24 points (3 mins 37 secs)
|
Average Score
|
227.52 (for 117 correct submissions)
|
There are two basic approaches to deciding whether a particular
man could be the father. The brute force approach is to test every
possible partitioning of characters from the child's sample between
the mother and the potential father, and accept any partitioning where
half of the characters come from the mother and half from the father.
With up to 220 possible partitionings per potential father,
and up to 5 potential fathers, there could be up to 5 million
partitionings to test. Timeout is certainly a concern in this range,
but with a modicum of care it should be possible to test all 5 million
partitionings in a few seconds.
A more efficient approach is to count how many characters in the child's
sample match the mother and how many match the potential father, allowing
for the fact that some characters could match both. If any characters match
neither the mother nor the potential father, then the man can immediately
be ruled out.
Otherwise, a valid partitioning is possible if and only if at least half the
characters match the mother and at least half match the potential father.
Why? Well, consider an example. Suppose the sample is 20 characters long,
with 12 characters matching the mother and 13 characters matching the
father. Then, there are 7 characters that match only the mother, 8 characters
that match only the father, and 5 characters that match both. To create
a valid partitioning, we only need to assign 3 of the overlapping characters
to the mother and 2 to the father, which is clearly possible.
But wait! We are guaranteed by the constraints that at least half the
characters in the child's sample match the mother, so we don't even need
to check that. We merely need to check that no characters match neither
the mother nor the potential father, and that at least half the characters
match the father, as in the following helper method:
boolean possibleFather(String child, String mother, String man) {
int match = 0;
for (int i = 0; i < child.length(); i++) {
if (child.charAt(i) == man.charAt(i)) match++;
else if (child.charAt(i) != mother.charAt(i)) return false;
}
return match >= child.length()/2;
}
The rest of the solution is simply to check each man, and keep the indices
of those who cannot be ruled out.
QuipuReader
Used as: Division One - Level Two:
Value
|
450
|
Submission Rate
|
98 / 136 (72.06%)
|
Success Rate
|
67 / 98 (68.37%)
|
High Score
|
ChristopherH for 395.86 points (10 mins 48 secs)
|
Average Score
|
271.22 (for 67 correct submissions)
|
This problem is similar to Quipu, the Division II Easy, but extends it
in two ways. First, there are now several cords instead of just one.
Second, there can now be an arbitrary number of '-'s between
clusters of knots. The tricky part of the problem is deciding where one digit
ends and the next digit begins. This is easy enough in examples like
"-XXX--XX---XXXX-"
"-XX--XXXX---XX--"
where one or more columns of dashes separate adjacent digits, but much
harder in examples like
"XXX-XX-XXXX"
"XX-XXXX-XX-"
"X--XXX---X-"
where adjacent digits are not separated by columns of dashes. A column begins
a new digit if it satisfies two conditions. First, it must contain at
least one '
X' that falls to the immediate right of a '
-'.
Second, it must not contain any '
X's that fall to the immediate right
of another '
X'. These conditions can be tested for column
i > 0 as follows:
boolean newDigit = false;
for (int j = 0; j < knots.length; j++) {
if (knots[j].charAt(i) == 'X') {
newDigit = (knots[j].charAt(i-1) == '-');
if (!newDigit) break;
}
}
The rest of the code walks over all the columns, multiplying the numbers
for all the cords by 10 when a new digit is begun, and incrementing the
appropriate cord's number for each '
X' that is found.
RedBlack
Used as: Division One - Level Three:
Value
|
1000
|
Submission Rate
|
20 / 136 (14.71%)
|
Success Rate
|
16 / 20 (80.00%)
|
High Score
|
SnapDragon for 621.54 points (25 mins 44 secs)
|
Average Score
|
465.91 (for 16 correct submissions)
|
Red-black trees are described in most textbooks on data structures, but
they can be quite tricky to implement, often taking well over a hundred lines
of code. This problem concerns a variant
of red-black trees that is much simpler and shorter. Instead of the four
different kinds of rotations used by standard red-black trees, this variant
uses a single rebalancing transformation
called a twist. The problem is to simulate a sequence of inserts and
count how many twists occur.
There are four different configurations in which a twist can occur, but
all four are twisted into the same target shape. By writing a method to
produce this shape, we avoid replicating the rebalancing code in four different
places. The desired shape is
(RED) y
/ \
/ \
(BLK) x z (BLK)
/ \ / \
T1 T2 T3 T4
and the method that produces this shape is
Node twist(Node x,Node y,Node z, Node T1,Node T2,Node T3,Node T4) {
twistCount++;
x.color = BLK; x.L = T1; x.R = T2;
z.color = BLK; z.L = T3; z.R = T4;
y.color = RED; y.L = x; y.R = z;
return y;
}
where I have abbreviated left and right as
L and
R.
The insertion code can be written as a recursive method. Except for
the rebalancing code, it looks just like any other insertion into
a binary search tree.
Node ins(int x,Node t) {
if (t == EMPTY) return new Node(RED,x,EMPTY,EMPTY);
if (x < t.value) t.L = ins(x,t.L);
else t.R = ins(x,t.R);
...rebalance by calling twist if necessary...
return t;
}
The tricky part, of course, is the rebalancing code. We need to call
twist whenever
t has a red child that has a red child
(
t itself will always be black in this case.) The four configurations
of red child and red grandchild can be coded very compactly as
// check if child and grandchild are red
if (t.L.color == RED && t.L.L.color == RED)
return twist(t.L.L,t.L,t, t.L.L.L,t.L.L.R,t.L.R,t.R);
if (t.L.color == RED && t.L.R.color == RED)
return twist(t.L,t.L.R,t, t.L.L,t.L.R.L,t.L.R.R,t.R);
if (t.R.color == RED && t.R.L.color == RED)
return twist(t,t.R.L,t.R, t.L,t.R.L.L,t.R.L.R,t.R.R);
if (t.R.color == RED && t.R.R.color == RED)
return twist(t,t.R,t.R.R, t.L,t.R.L,t.R.R.L,t.R.R.R);
Although these masses of
.L's and
.R's can
be hard to read, they correspond directly to the
x,y,z
and
T1,T2,T3,T4 from the diagrams given in the problem:
(BLK) z (BLK) z x (BLK) x (BLK)
/ \ / \ / \ / \
(RED) y T4 (RED) x T4 T1 z (RED) T1 y (RED)
/ \ / \ / \ / \
(RED) x T3 T1 y (RED) (RED) y T4 T2 z (RED)
/ \ / \ / \ / \
T1 T2 T2 T3 T2 T3 T3 T4
Another trick that helps to make this code compact is to use a sentinel
node to represent the bottom of the tree, instead of null.
The sentinel node, called EMPTY, is simply a black node with
arbitrary values in its other fields. Notice how the ins method
checks for t == EMPTY instead of t == null. Without the
sentinel node, tests like
if (t.L.color == RED && t.L.L.color == RED)
return ...;
would have to be written as
if (t.L != null && t.L.color == RED &&
t.L.L != null && t.L.L.color == RED)
return ...;
The last detail to remember about insertions is that the root is colored black
at the end of every insertion. This can be accomplished by a wrapper method
that merely calls the main insertion method (ins) and colors the
result black.
Node insert(int x,Node root) {
root = ins(x,root);
root.color = BLK;
return root;
}
See my solution in the practice room for the complete code.
Many people complicated their solutions by trying to maintain parent
pointers. Parent pointers are necessary only if you try to code insertion
using loops instead of recursion, but such an approach tends to be messy.
This variant of red-black trees is a handy data structure to add to
your toolbox. Although it may have seemed difficult in a timed
setting, it is one of the simplest forms of balanced binary search
trees you'll ever find.
By
vorthys
TopCoder Member