    use Benchmark qw(timethese cmpthese timeit countit timestr);

    # You can always pass in code as strings:
    timethese $count, {
        'Name1' => '...code1...',
        'Name2' => '...code2...',
    };

    # Or as subroutine references:
    timethese $count, {
        'Name1' => sub { ...code1... },
        'Name2' => sub { ...code2... },
    };

    cmpthese $count, {
        'Name1' => '...code1...',
        'Name2' => '...code2...',
    };

    $t = timeit $count, '...code...';
    print "$count loops of code took:", timestr($t), "\n";

    $t = countit $time, '...code...';
    $count = $t->iters;
    print "$count loops of code took:", timestr($t), "\n";

The Benchmark module can help you determine which of several possible choices executes the fastest. The timethese function runs the specified code segments the number of times requested and reports back how long each segment took. You can get a nicely sorted comparison chart if you call cmpthese the same way.
Code segments may be given as function references instead of strings (in fact, they must be if you use lexical variables from the calling scope), but call overhead can influence the timings. If you don't ask for enough iterations to get a good timing, the function emits a warning.
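As a rough illustration of the lexical-variable case, the sketch below times two closures that both read a my variable from the calling scope. The data (@words), the labels (join_op, concat), and the iteration count are made up for this example; the 'none' style argument only suppresses the per-test report so the script can print its own summary.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# A lexical from the calling scope. Code passed as a string could not
# see @words, so both candidates are passed as closures instead.
my @words = map { "word$_" } 1 .. 1_000;    # hypothetical sample data

# timethese returns a hash reference of Benchmark objects keyed by name.
my $results = timethese(200, {
    join_op => sub { my $s = join '', @words; },
    concat  => sub { my $s = ''; $s .= $_ for @words; },
}, 'none');

for my $name (sort keys %$results) {
    printf "%-8s ran %d iterations\n", $name, $results->{$name}->iters;
}
```

With a count this small, Benchmark may well print its "too few iterations" warning; raising the count, or passing a negative count to request a minimum number of CPU seconds, is the usual remedy.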
Lower-level interfaces are available that run just one piece of code either for some number of iterations (timeit) or for some number of seconds (countit). These functions return Benchmark objects (see the online documentation for a description). With countit, you know it will run in enough time to avoid warnings, because you specified a minimum run time.
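The two lower-level calls might be used side by side as in the following sketch. The sort workload, the @data array, and the chosen count and time limit are assumptions picked only to keep the example short; any code string or subroutine reference could stand in their place.

```perl
use strict;
use warnings;
use Benchmark qw(timeit countit timestr);

my @data = map { $_ * 3 % 101 } 1 .. 500;    # hypothetical sample data
my $work = sub { my @sorted = sort { $a <=> $b } @data };

# timeit: run the code a fixed number of times.
my $fixed = timeit(1_000, $work);
print "1000 loops of code took: ", timestr($fixed), "\n";

# countit: run the code for at least the given number of CPU seconds,
# then ask the returned Benchmark object how many iterations fit.
my $timed = countit(0.5, $work);
my $count = $timed->iters;
print "$count loops of code took: ", timestr($timed), "\n";
```

Both calls return a Benchmark object, so timestr and the iters method work the same way on either result.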
To get the most out of the Benchmark module, you'll need a good bit of practice. It isn't usually enough to run a couple of different algorithms on the same data set, because the timings only reflect how well those algorithms did on that particular data set. To get a better feel for the general case, you'll need to run several sets of benchmarks, varying the data sets used.
For example, suppose you wanted to know the best way to get a copy of a string without the last two characters. You think of four ways to do so (there are, of course, several others): chop twice, copy and substitute, or use substr on either the left- or righthand side of an assignment. You test these algorithms on strings of length 2, 200, and 20_000:
    use Benchmark qw/countit cmpthese/;

    # Run each code string for at least 5 CPU seconds.
    sub run($) { countit(5, @_) }

    for $size (2, 200, 20_000) {
        $s = "." x $size;    # test string of the current length
        print "\nDATASIZE = $size\n";
        cmpthese {
            chop2   => run q{
                $t = $s; chop $t; chop $t;
            },
            subs    => run q{
                ($t = $s) =~ s/..\Z//s;
            },
            lsubstr => run q{
                $t = $s; substr($t, -2) = '';
            },
            rsubstr => run q{
                $t = substr($s, 0, length($s)-2);
            },
        };
    }

which produces the following output:

    DATASIZE = 2
                Rate    subs lsubstr   chop2 rsubstr
    subs    181399/s      --    -15%    -46%    -53%
    lsubstr 214655/s     18%      --    -37%    -44%
    chop2   338477/s     87%     58%      --    -12%
    rsubstr 384487/s    112%     79%     14%      --

    DATASIZE = 200
                Rate    subs lsubstr rsubstr   chop2
    subs    200967/s      --    -18%    -24%    -34%
    lsubstr 246468/s     23%      --     -7%    -19%
    rsubstr 264428/s     32%      7%      --    -13%
    chop2   304818/s     52%     24%     15%      --

    DATASIZE = 20000
              Rate rsubstr    subs lsubstr   chop2
    rsubstr 5271/s      --    -42%    -43%    -45%
    subs    9087/s     72%      --     -2%     -6%
    lsubstr 9260/s     76%      2%      --     -4%
    chop2   9660/s     83%      6%      4%      --

With small data sets, the "rsubstr" algorithm runs 14% faster than the "chop2" algorithm, but in large data sets, it runs 45% slower. On empty data sets (not shown here), the substitution mechanism is the fastest. So there is often no best solution for all possible cases, and even these timings don't tell the whole story, since you're still at the mercy of your operating system and the C library Perl was built with. What's good for you may be bad for someone else. It takes a while to develop decent benchmarking skills. In the meantime, it helps to be a good liar.
Copyright © 2002 O'Reilly & Associates. All rights reserved.