How to profile Perl script performance?
Profiling Perl scripts helps identify performance bottlenecks by measuring where your code spends the most time and resources. Perl offers several profiling approaches, from built-in options to powerful CPAN modules.
Built-in Profiling with Devel::DProf
Older Perl releases shipped Devel::DProf in core for basic profiling. Run your script with the -d:DProf flag to generate a tmon.out file, then analyze it with dprofpp. However, the module was deprecated in Perl 5.14 and removed from core in Perl 5.16, so on a modern Perl it is only available from CPAN and is not recommended for new work.
Manual Timing with Benchmark Module
For targeted performance testing, use the core Benchmark module to compare different code approaches. This is ideal when you want to measure specific subroutines or algorithms rather than profile an entire application:
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);
use Time::HiRes qw(time);   # replaces time() with a sub-second version

# Compare different approaches to summing numbers
my @numbers = 1..10000;

print "Profiling different summation methods:\n\n";
my $results = timethese(1000, {
    'foreach_loop' => sub {
        my $sum = 0;
        foreach my $n (@numbers) { $sum += $n; }
    },
    'for_loop' => sub {
        my $sum = 0;
        for (my $i = 0; $i < @numbers; $i++) { $sum += $numbers[$i]; }
    },
    # Despite the name, this is a postfix for loop, not a reduce call
    'array_reduce' => sub {
        my $sum = 0;
        $sum += $_ for @numbers;
    },
});

print "\nComparison:\n";
cmpthese($results);

print "\n--- Time::HiRes for precise measurements ---\n";
my $start = time();
my $result = 0;
$result += $_ for 1..100000;
my $elapsed = time() - $start;

printf "Processed 100,000 numbers in %.4f seconds\n", $elapsed;
printf "Rate: %.0f operations/second\n", 100000/$elapsed if $elapsed > 0;
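A fixed iteration count can be too small for fast code, in which case Benchmark warns that there are too few iterations for a reliable count. The following is a minimal sketch of one way around that: passing a negative count asks Benchmark to run each candidate for at least that many CPU seconds. The array size and the key names are assumptions chosen for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @numbers = 1 .. 1000;   # illustrative data size, chosen for this sketch

# A negative count means "run each sub for at least this many CPU seconds"
# rather than a fixed number of iterations.
my $rows = cmpthese(-1, {
    foreach_loop => sub {
        my $sum = 0;
        foreach my $n (@numbers) { $sum += $n }
        return $sum;
    },
    c_style_for => sub {
        my $sum = 0;
        for (my $i = 0; $i < @numbers; $i++) { $sum += $numbers[$i] }
        return $sum;
    },
});
```

cmpthese prints the comparison table and also returns it as an array reference of rows, which can be handy if you want to log the results rather than read them off the terminal.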
Best Practice: Devel::NYTProf
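While not in core, Devel::NYTProf is the gold standard for Perl profiling. It provides statement-level (line-by-line) timing, subroutine-level timing, and HTML reports. Install it with cpanm Devel::NYTProf, run perl -d:NYTProf script.pl to write a nytprof.out data file, then run nytprofhtml to turn that file into an HTML report in the nytprof directory.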
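When only part of a script is of interest, Devel::NYTProf can be told to start disabled and be switched on around that region. The sketch below assumes the script is launched with NYTPROF=start=no; busy_work and script.pl are hypothetical names used for illustration, and the guard keeps the script runnable when the profiler is not loaded.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical workload standing in for the code you want to profile.
sub busy_work {
    my ($n) = @_;
    my $sum = 0;
    $sum += $_ * $_ for 1 .. $n;
    return $sum;
}

# DB::enable_profile and DB::disable_profile are only defined when the
# script runs under Devel::NYTProf, so guard the calls for normal runs.
my $profiling = defined &DB::enable_profile;

DB::enable_profile() if $profiling;
my $total = busy_work(100_000);
DB::disable_profile() if $profiling;

print "total = $total\n";

# Assumed invocation (script.pl is a placeholder name):
#   NYTPROF=start=no perl -d:NYTProf script.pl   # writes nytprof.out
#   nytprofhtml                                   # writes the report to ./nytprof/
```

Limiting the profiled region keeps the data file small and makes the report easier to read, at the cost of hiding time spent outside that region.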
Common Pitfalls
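- Profiling adds overhead and can distort timings, so profile with realistic data and treat absolute numbers as approximate
- Focus on hot spots (the code where most of the run time is actually spent) rather than optimizing everything
- I/O waits often dominate wall-clock time without showing up as CPU time; measure wall-clock and CPU time separately
- The Benchmark module reports rates based on CPU time (usr + sys), so it understates code that waits on I/O, and it warns when there are too few iterations for a reliable count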
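To illustrate why wall-clock and CPU time should be looked at separately, here is a minimal sketch that measures both for a waiting task and a computing task. The measure helper is made up for this example, and the sleep call merely stands in for I/O wait.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Run a code ref and return (wall-clock seconds, CPU seconds).
# measure() is a helper made up for this sketch, not a library function.
sub measure {
    my ($code) = @_;
    my ($user0, $sys0) = times();   # CPU time used so far by this process
    my $wall0 = time();
    $code->();
    my $wall = time() - $wall0;
    my ($user1, $sys1) = times();
    return ($wall, ($user1 - $user0) + ($sys1 - $sys0));
}

# Sleeping stands in for waiting on a disk or the network.
my ($io_wall, $io_cpu) = measure(sub { sleep(0.2) });

# A tight loop keeps the CPU busy for the whole duration.
my ($cpu_wall, $cpu_cpu) = measure(sub {
    my $sum = 0;
    $sum += $_ for 1 .. 2_000_000;
});

printf "waiting:   wall %.3fs, CPU %.3fs\n", $io_wall,  $io_cpu;
printf "computing: wall %.3fs, CPU %.3fs\n", $cpu_wall, $cpu_cpu;
```

For the waiting task the CPU figure should stay close to zero while the wall-clock figure reflects the full delay, which is the pattern a CPU-only measurement would miss.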
Version Notes
Time::HiRes has been in core since Perl 5.7.3 (first stable release: 5.8.0), and Benchmark has shipped with Perl 5 since its earliest releases. Devel::DProf was removed from core in Perl 5.16; use Devel::NYTProf instead.
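Core-module history can be checked against your own installation rather than taken on trust. This is a small sketch using the core Module::CoreList module; the list of module names is just the ones discussed here, and the reported versions depend on the data shipped with your copy of Module::CoreList.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Module::CoreList;

# Report when each module first shipped with perl and, if applicable,
# the first release that no longer bundled it.
for my $module (qw(Benchmark Time::HiRes Devel::DProf)) {
    my $first   = Module::CoreList->first_release($module);
    my $removed = Module::CoreList->removed_from($module);
    printf "%-14s first in core: %-10s removed: %s\n",
        $module,
        defined $first   ? $first   : 'never',
        defined $removed ? $removed : 'not removed';
}
```

The corelist command-line tool that ships with Module::CoreList reports the same data without writing a script.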
Verified Code
Executed in a sandbox to capture real output. • v5.34.1 • 1919ms
Profiling different summation methods:
Benchmark: timing 1000 iterations of array_reduce, for_loop, foreach_loop...
array_reduce: 0 wallclock secs ( 0.19 usr + 0.06 sys = 0.25 CPU) @ 4000.00/s (n=1000)
(warning: too few iterations for a reliable count)
for_loop: 1 wallclock secs ( 0.53 usr + 0.16 sys = 0.69 CPU) @ 1449.28/s (n=1000)
foreach_loop: 0 wallclock secs ( 0.24 usr + 0.06 sys = 0.30 CPU) @ 3333.33/s (n=1000)
(warning: too few iterations for a reliable count)
Comparison:
               Rate     for_loop foreach_loop array_reduce
for_loop     1449/s           --         -57%         -64%
foreach_loop 3333/s         130%           --         -17%
array_reduce 4000/s         176%          20%           --
--- Time::HiRes for precise measurements ---
Processed 100,000 numbers in 0.0031 seconds
Rate: 32521548 operations/second