Performance and Optimization in the "Real World"
Wednesday, March 2, 2005
By n0vice
TopCoder Member
Introduction
Most people involved with TopCoder are perfectionists, always looking for the fastest and most elegant solution. During Single Round Matches we strive to write tight code, since every additional line or loop brings us closer to the "eight seconds of doom". So, higher-performance code is always better, right? Fortunately, wrong.
This article will talk about the basic performance strategy for software development. I hope it will curb some of the extreme enthusiasm for CPU optimization innate to all programmers (especially the less experienced ones). Software performance is a very broad topic. We will only consider general-purpose software (not scientific computing, data mining, etc).
Consider the trade-offs
Let's start with the following:
The Cardinal Rules of Optimization:
- Do not optimize
- For experts only: do not optimize yet
Why is optimizing so discouraged? Performance is not free. It usually costs a lot of time and results in less readable and maintainable code, even when done right. Optimizing too early often results in badly functioning implementations and poor long-term decisions. In many cases, modern compilers optimize code better than developers who spend hours tweaking it by hand. Programmers tend to focus on local techniques and code fragments to improve performance, when the largest impact lies in design and algorithm choices. Programmers also tend to rely on intuition and experience, rather than data, to recognize performance problems and make fixes. The list goes on.
Perhaps the strongest reason for postponing or omitting optimizations is expressed by this great quote:
"The biggest performance increase you'll ever see is when your system goes from not working to working."
- John Ousterhout
In other words, time is better spent making sure the system is correct than making sure it runs fast. Performance does not matter if the software does not work.
Any project of sufficient size requires a performance strategy. "Set goals and measure against them" is a simple (yet surprisingly non-obvious) strategy, which allows developers to avoid most pitfalls of performance and optimization.
Set goals
The single most important step in performance analysis is goal setting. It is impossible to achieve good performance without defining, in detail, what "good performance" means. We need to define metrics against which a given candidate solution can be compared. Otherwise, the effort that can be put into optimizing is virtually unbounded.
For example, let's look at TopCoder SRMs. While SRMs and their frantic pace encourage some practices that would likely be considered unacceptable for commercial software development, they do a great job of setting performance goals. For TopCoder matches there is a single performance metric: the solution must execute within eight seconds for any valid input. It is very important to understand that, as far as performance is concerned, an SRM solution that runs in 7.9s is just as good as one that runs in 0.1s. In fact, the better of the two is the one that can be implemented in the shortest time.
Of course, correctness is paramount. A lightning-fast piece of code that fails in a corner case is worthless compared to a behemoth that always works right. To reconcile functionality and performance demands, we can use these performance goal-setting guidelines:
- Make the common case fast
- Make the uncommon case work
In other words, we take use-case analysis into consideration when setting goals that make sense for a given application, and we make sure that an implementation that satisfies the goal still works correctly in all scenarios.
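As an illustration, here is a minimal C++ sketch of this pattern, assuming a hypothetical lookup where most requests hit a small set of hot keys. The names CachedLookup and fetch_from_database are invented for this example: the cache makes the common case fast, while the slow path keeps the uncommon case correct.

#include <string>
#include <unordered_map>

// Hypothetical stand-in for an expensive but always-correct operation
// (database query, disk read, full recomputation, ...).
std::string fetch_from_database(const std::string& key) {
    return "value-for-" + key;
}

class CachedLookup {
public:
    std::string get(const std::string& key) {
        // Common case: the key is already cached -- fast.
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        // Uncommon case: fall back to the slow path, which is
        // guaranteed to work for every valid key, and remember the result.
        std::string value = fetch_from_database(key);
        cache_[key] = value;
        return value;
    }
private:
    std::unordered_map<std::string, std::string> cache_;
};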
Another great example of the importance of goals is real-time spell checking. Suppose we are making a text editor that executes a spell-checker routine after every key press. Currently, the routine executes in no more than 50ms. Since we have determined that an average user cannot type faster than a character every 150ms, the 50ms routine satisfies the goal, and no further time needs to be spent improving it.
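A simple way to enforce such a goal during development is to time the routine against its budget. The following C++ sketch assumes a hypothetical spell_check_current_word() routine standing in for the real one, and logs whenever the 50ms budget is exceeded:

#include <chrono>
#include <cstdio>

// Hypothetical spell-check routine; stands in for the real one.
void spell_check_current_word() { /* ... */ }

int main() {
    const auto budget = std::chrono::milliseconds(50);  // the goal from the text

    auto start = std::chrono::steady_clock::now();
    spell_check_current_word();
    auto elapsed = std::chrono::steady_clock::now() - start;

    if (elapsed > budget) {
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(elapsed);
        std::printf("spell check exceeded budget: %lld ms\n", (long long)ms.count());
    }
    return 0;
}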
Measure
Once the goals are set, we must measure against them. In general, no one should be allowed to make performance claims without having supporting numbers for the defined metrics. For example, which code is better (assume the compiler does not optimize):
Code A:
int pow8(int x) {
    return x*x*x*x*x*x*x*x;   // seven multiplications
}
Code B:
int pow8(int x) {
    int a = x*x;   // x^2
    int b = a*a;   // x^4
    return b*b;    // x^8: three multiplications via repeated squaring
}
The answer depends on whether this piece of code has any effect on our performance metrics. The most direct way to find out is to measure with both versions. The only justification for optimizing is concrete evidence of improvement relative to preset goals.
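As a sketch of what such a measurement might look like, the following C++ benchmark times many calls to each version. The iteration count is arbitrary, and the volatile sink keeps the compiler from discarding the computation entirely:

#include <chrono>
#include <cstdio>

int pow8_a(int x) { return x*x*x*x*x*x*x*x; }
int pow8_b(int x) { int a = x*x; int b = a*a; return b*b; }

// Time 'iterations' calls of f and report milliseconds elapsed.
template <typename F>
double time_ms(F f, int iterations) {
    volatile int sink = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink = sink + f(i);
    auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration<double, std::milli>(elapsed).count();
}

int main() {
    const int n = 10000000;  // arbitrary iteration count
    std::printf("Code A: %.1f ms\n", time_ms(pow8_a, n));
    std::printf("Code B: %.1f ms\n", time_ms(pow8_b, n));
    return 0;
}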
To find CPU bottlenecks proactively, one can use profiling. Running benchmarks under a profiler (for example, gprof in the GNU toolchain) shows the breakdown of time spent in each routine. If it turns out that pow8() accounts for only 1% of the total running time, then even eliminating it entirely would improve overall performance by at most 1%.
However, there is a big caveat: the CPU is rarely the bottleneck. Nowadays, processors are so fast and compilers are so good that most issues come from input/output (I/O). I/O problems encompass everything from the application's own reads and writes, to the components it invokes (such as databases), to the host operating system. Finding I/O bottlenecks is significantly more difficult, and no one-size-fits-all approach exists. In some cases, when a program performs I/O synchronously, the waiting shows up in the elapsed time of the calling routine, so profiling can still help identify the problem. In general, the benchmarks used for measurement should collect as much auxiliary data as possible to aid the investigation. For example, Microsoft Windows provides a multitude of performance counters (such as CPU utilization and page faults per second) that can be plotted alongside application-defined counters.
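One crude but useful diagnostic is to compare CPU time with wall-clock time around a suspect routine. The following C++ sketch assumes a hypothetical do_work() routine and a platform where std::clock() reports process CPU time (as on POSIX systems; some Windows runtimes report elapsed time instead):

#include <chrono>
#include <cstdio>
#include <ctime>

// Hypothetical stand-in for a routine suspected of waiting on I/O
// (file reads, database queries, network calls, ...).
void do_work() { /* ... */ }

int main() {
    std::clock_t cpu_start = std::clock();
    auto wall_start = std::chrono::steady_clock::now();

    do_work();

    double cpu_ms  = 1000.0 * (std::clock() - cpu_start) / CLOCKS_PER_SEC;
    double wall_ms = std::chrono::duration<double, std::milli>(
        std::chrono::steady_clock::now() - wall_start).count();

    // If wall-clock time greatly exceeds CPU time, the routine spends
    // most of its time waiting rather than computing.
    std::printf("CPU: %.0f ms, wall: %.0f ms\n", cpu_ms, wall_ms);
    return 0;
}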
Conclusion
The only way to succeed with performance is to set goals and measure against them. Under time pressure, it is important to focus on the bottlenecks that most affect the metrics defined for the project at hand. While we always want to create the fastest products possible, a disciplined strategy is necessary to weigh trade-offs properly and direct programming effort efficiently.