On and off over the last few days I've been trying to track down a problem in a fairly complicated POE-based Perl daemon, which was taking a huge amount of time to fetch a large file via LWP::UserAgent. It has taken a while because in reality we were not doing a simple HTTP GET, and the circumstances under which the problem occurs are difficult to set up. However, I've at last tracked it down to taint mode.
We have a 30MB file on an Apache web server on our local 1Gb network, and I've reduced the code down to:
This script normally takes around 3 seconds, but with taint mode enabled (-t) it takes over 5 minutes.
We were using Perl 5.10.0, but I also tried 5.10.1 and 5.12.1 and saw the same behaviour.
I used Devel::NYTProf to examine what was happening (see Long time reported for ref($x->{key})) and this is what I got:
313 seconds doing my $ref = ref($self->{_content});! It seems much more likely that the 313s was actually spent in $self->{_content} .= $$chunkref, because as far as I can see this builds up the HTTP response 4K at a time, concatenating onto an ever-growing string. Tim Bunce took a look at my nytprof.out file and said the ticks spent on the offending line started low and kept increasing to ridiculous levels, so perhaps taint mode does something to the result of the concatenation. If so, as the string grows, whatever taint mode does has to work on an ever larger string, and hence takes longer on each iteration.
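One way to check that hypothesis without a profiler is to time the appends in batches and see whether each batch gets slower as the string grows. The following is only a minimal sketch, not the code from the daemon: the batch size and batch count are arbitrary, and it assumes that $^X is tainted under taint mode, so that appending a zero-length slice of it taints the chunk. Run it with and without the taint switch and compare the per-batch times.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use Scalar::Util qw(tainted);

# Build a 4096-byte chunk. Appending a zero-length slice of $^X makes the
# chunk tainted when running in taint mode (assumption: $^X is tainted
# there); without taint mode it is an ordinary string.
my $chunk = ('x' x 4096) . substr($^X, 0, 0);
print "chunk is tainted: ", (tainted($chunk) ? 'yes' : 'no'), "\n";

my %hash = (content => '');
my $batches = 5;      # number of timing samples (arbitrary)
my $batch   = 1000;   # appends per sample (arbitrary)

for my $n (1 .. $batches) {
    my $start = time();
    for (1 .. $batch) {
        my $ref = ref($hash{content});   # the line the profiler blamed
        $hash{content} .= $chunk;        # the append suspected of being slow
    }
    # If the cost depends on the string length, later batches take longer.
    printf "batch %d: %.3fs, length now %d\n",
        $n, time() - $start, length $hash{content};
}
```

If the per-batch time stays roughly flat, the cost per append does not depend on the length of the string. If it climbs steadily, that is consistent with something touching the whole string on every iteration.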
For now I've had to disable taint mode as we cannot cope with this. I'd appreciate any ideas.
Comments
An even more straightforward example
In Perl 5.12.1 the following code takes less than 1 second to run without taint mode, and 8 minutes with it!
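use strict;
use warnings;
use Scalar::Util qw(tainted);
# Write a 4096-byte file so that it can be read back as tainted input.
my $fd;
open($fd, ">", "file.dat") or die "Cannot write file.dat: $!";
print $fd 'x' x 4096;
close $fd;
my $data;
{
local $/; # slurp mode: read the whole file in one go
open($fd, "<", "file.dat") or die "Cannot read file.dat: $!";
$data = <$fd>;
close $fd;
}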
print "data is tainted: ", tainted($data) ? 'yes' : 'no', "\n";
my %hash;
$hash{content} = '';
foreach (1..10000) {
my $ref = ref($hash{content});
#if (!$ref) {
$hash{content} .= $data;
#}
}
print length($hash{content}), "\n";
Use an array of strings instead of a simple string
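use strict;
use warnings;
use Scalar::Util qw(tainted);
# Write a 4096-byte file so that it can be read back as tainted input.
my $fd;
open($fd, ">", "file.dat") or die "Cannot write file.dat: $!";
print $fd 'x' x 4096;
close $fd;
my $data;
{
local $/; # slurp mode: read the whole file in one go
open($fd, "<", "file.dat") or die "Cannot read file.dat: $!";
$data = <$fd>;
close $fd;
}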
print "data is tainted: ", tainted($data) ? 'yes' : 'no', "\n";
my %hash;
$hash{content} = []; # change #1
foreach (1..10000) {
my $ref = ref($hash{content});
#if (!$ref) {
push @{$hash{content}}, $data; # change #2
#}
}
print length(join "", @{$hash{content}}), "\n"; # change #3
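To see whether the array approach actually helps on a particular Perl build, the two strategies can be timed side by side. This is a rough sketch rather than a proper benchmark: time_it is a hypothetical helper introduced here, the iteration count is reduced from 10000 to keep the run short, and the chunk is tainted via a zero-length slice of $^X (assuming $^X is tainted in taint mode) instead of being read from file.dat.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# 4096-byte chunk; tainted only when the script runs in taint mode
# (assumption: $^X is tainted there).
my $chunk      = ('x' x 4096) . substr($^X, 0, 0);
my $iterations = 2000;   # arbitrary, smaller than the 10000 used above

# Hypothetical helper: run a code ref, print how long it took, and
# return the length it reports.
sub time_it {
    my ($label, $code) = @_;
    my $start  = time();
    my $length = $code->();
    printf "%-7s %8.3fs (%d bytes)\n", $label, time() - $start, $length;
    return $length;
}

# Strategy 1: append to one ever-growing string.
my $len_string = time_it(string => sub {
    my %hash = (content => '');
    for (1 .. $iterations) {
        my $ref = ref($hash{content});
        $hash{content} .= $chunk;
    }
    return length $hash{content};
});

# Strategy 2: collect the chunks in an array and join once at the end.
my $len_array = time_it(array => sub {
    my %hash = (content => []);
    for (1 .. $iterations) {
        my $ref = ref($hash{content});
        push @{ $hash{content} }, $chunk;
    }
    return length join '', @{ $hash{content} };
});
```

Both strategies should report the same number of bytes. Only the timings are expected to differ, and only on Perl versions affected by the slowdown.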
Interesting solution
reported to perlbug