The Perl You Need to Know: Benchmarking Perl
Originally appearing in the Web Developer's Virtual Library.
One of the most popular events of the summer Olympics is the men's 100 meter sprint, an event that has appeared at every Olympics since the original at Athens. The competitors are often fiery tempered, and the victors are declared among the fastest men on Earth. The millions who watch the 100 meter are captured, ultimately, by some 10 seconds' worth of action. Actually, 9.79 seconds, considering Maurice Greene's current world record. But what would Mr. Greene say if we told him that a Perl subroutine can split and sort a sentence into an alphabetically ordered list of letters 98,253 times in that same 9.79 seconds? What do you have to say for yourself now, Mr. Fast Pants? Well, he'd most likely pummel our soft and fleshy behinds, and those of anyone who knows this much about Perl. But it's still true. And we know it's true because of the Benchmark module - our handy Perl stopwatch with which we can time, optimize, and slim down our code. I think Mr. Greene would appreciate that. After the pummeling.
Quality is Job Number One
It's been said that there are a thousand ways to reach any single goal in Perl. A thousand may be on the optimistic side, but Perl is indeed a very flexible language, liberating our quirky brains to find solutions peculiar to our individual personalities. But let's not be fooled into complacency by the tolerance and open-minded nature of the cult of Perl -- not all solutions are created equal, especially when it comes to execution time. This installment of The Perl You Need To Know covers the Benchmark module, the tool that tells us which solutions are fast and which are not.
Interpreted languages like Perl are often looked down upon by speed demons, who prefer the endorphin highs of compiled languages, and by the extremists who spend their days inhaling assembly language and machine code. Sneers aside, it's quite easy to conjure up several Perl subroutines that all solve the same problem, yet do so in wildly varying execution times. This may not matter for the script that is only run once, or on special occasions -- but with Perl backing many Web servers, it's entirely possible for Perl scripts to be executed millions of times a day. Like picking pennies from the floor, it all adds up over time.
Perl's Benchmark module is an extremely useful tool for measuring the speed of your scripts, whether in their entirety, or down to the level of particular subroutines or single lines of code. While some may want to comb through a script to optimize the speed of every expression, benchmarking is also incredibly helpful in finding significant bottlenecks in a script - segments of code that eat up the majority of processing time. Often, optimizing just one or two main bottlenecks can improve an entire script's execution time dramatically.
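As a rough illustration of hunting for a bottleneck, the sketch below brackets one section of a script with two Benchmark timestamps and reports the difference. The subroutine build_report and its workload are hypothetical stand-ins for whatever section you suspect is slow; only Benchmark->new, timediff, and timestr come from the module itself.

```perl
use strict;
use warnings;
use Benchmark qw(timediff timestr);

# Hypothetical workload standing in for a slow section of a real script.
sub build_report {
    my $total = 0;
    $total += $_ * $_ for 1 .. 200_000;
    return $total;
}

my $start  = Benchmark->new;    # timestamp taken before the section runs
my $result = build_report();
my $end    = Benchmark->new;    # timestamp taken after the section finishes

# timediff returns a Benchmark object holding the elapsed times;
# timestr formats it as a human-readable string.
my $elapsed = timediff($end, $start);
print "build_report took: ", timestr($elapsed), "\n";
```

Wrapping several suspect sections this way, one pair of timestamps each, is one simple way to see which of them accounts for most of the running time.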
Out of the Starting Block
The Benchmark module provides a number of tools for timing and comparing the execution time of code. Depending on how you use these tools, you can time single statements, entire subroutines, or the entire script. Often, you'll want to time all of these depending on the script and the course of your investigation. For starters, let's just say that you want to time a simple statement. Perhaps you have some variations in mind, and are wondering which would be fastest.
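To give a feel for what timing a small piece of code looks like, here is a minimal sketch that borrows the split-and-sort task from the introduction. The sample sentence, the subroutine name sorted_letters, and the iteration count are all assumptions chosen for illustration; timethis is the Benchmark function that runs a piece of code a given number of times and prints a timing report.

```perl
use strict;
use warnings;
use Benchmark qw(timethis);

# Sample input, assumed for this sketch.
my $sentence = "the quick brown fox jumps over the lazy dog";

# Split the text into single characters, discard whitespace,
# and return the remaining characters in alphabetical order.
sub sorted_letters {
    my ($text) = @_;
    my @letters = grep { /\S/ } split //, $text;
    my @sorted  = sort @letters;
    return @sorted;
}

# Run the code 100,000 times. timethis prints a report to STDOUT
# and also returns the timing as a Benchmark object.
my $t = timethis(100_000, sub { my @letters = sorted_letters($sentence) });
```

The exact figures in the report will vary with your hardware and Perl version, so treat any single run as a rough guide rather than a precise measurement.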
Imagine a string, perhaps a filename, from which you want to extract the filename extension - defined here as anything that occurs to the right of the period. For example, the string may be "filename.txt" and we want the "txt" portion.
The first solution that comes to mind uses a regular expression pattern match, such as: