Devel::NYTProf saved us again and raises a question

Not for the first time, a Perl POE daemon in our development system seemed to be taking a long time on certain requests. It is a large body of code and, although we had our suspicions, we have learnt the lesson by now: Devel::NYTProf is so simple to use that we had no hesitation in running it on this daemon. Instead of:
perl program.pl program_args
you simply run:
perl -d:NYTProf program.pl program_args
A nytprof.out file is created, and you can then generate the HTML report and view it with:
mkdir report
nytprofhtml -o report
firefox report/index.html
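For a long-running daemon that is normally stopped with a signal rather than exiting cleanly, the profiler can be tuned via the NYTPROF environment variable; the options below are from the Devel::NYTProf documentation, though the exact set available depends on the version you have installed:

# write nytprof.out.<pid> per process and finish the profile cleanly on SIGINT/SIGHUP
NYTPROF=addpid=1:sigexit=int,hup perl -d:NYTProf program.pl program_args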
Unsurprisingly, our suspicions were confirmed. The daemon issues an HTTP POST request that retrieves a large amount of JSON-encoded data. In this case the first element in the JSON needs to be extracted and compared with an argument passed in the client request, to ensure the client is allowed the JSON data (you're probably asking why we don't do this further down the line, in the code handling the HTTP POST, but just take my word that we cannot right now), and then that element needs to be removed. The JSON is decoded into a Perl arrayref, the first element is examined and checked, and finally it is deleted from the array before the whole thing is encoded again and passed back to the client. As the JSON data was large, decoding it into a Perl array and encoding it back into JSON was consuming a lot of time. The JSON itself is fairly simple, especially as it is an array of strings and numbers, and looks like this:
["string", number, "string2"...we had a lot here]
Since the JSON encoding is straightforward, and we know the first element is a string and that it is the one we want to extract, it was far simpler to do:
if ($json =~ s/^\["(.*?)"\s*,\s*(.*)$/[$2/) {
  # $1 is extracted string
  # $2 is new JSON without the first string element
}
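If you want to convince yourself the substitution really is equivalent, and measure the gain, a quick comparison along these lines is easy to put together (Benchmark is core Perl; the sample data here is made up to stand in for the real payload):

use JSON;
use Benchmark qw(cmpthese);

# one leading string followed by a mix of numbers and strings
my $json = encode_json([ "first", map { $_ % 2 ? $_ : "str$_" } 1 .. 50_000 ]);

cmpthese(-2, {
    decode_encode => sub {
        my $data  = decode_json($json);
        my $first = shift @$data;
        my $out   = encode_json($data);
    },
    regex_strip => sub {
        (my $out = $json) =~ s/^\["(.*?)"\s*,\s*(.*)$/[$2/;
        my $first = $1;
    },
});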
The time dropped by over 90%. Surprisingly (for us), although the above was a wasteful time consumer, it was not the main one. Most of the time was lost in HTTP::Message::add_content:
149                                      sub add_content
150
# spent 320s (320+149ms) within HTTP::Message::add_content which was called 7585 times, avg 42.3ms/call:
# 7585 times (320s+149ms) by LWP::Protocol::__ANON__[/usr/local/share/perl/5.10.0/LWP/Protocol.pm:139]
#   at line 137 of LWP/Protocol.pm, avg 42.3ms/call
                                         {
151   7585   12.0ms                          my $self = shift;
152   7585   22.4ms                          $self->_content unless exists $self->{_content};
153   7585   14.5ms                          my $chunkref = \$_[0];
154   7585   14.7ms                          $chunkref = $$chunkref if ref($$chunkref); # legacy
155
156   7585   66.2ms   7585   149ms           _utf8_downgrade($$chunkref);
# spent 149ms making 7585 calls to HTTP::Message::__ANON__[HTTP/Message.pm:18], avg 20µs/call
157
158   7585   320s                            my $ref = ref($self->{_content});
159   7585   53.9ms                          if (!$ref) {
160   7585   365ms                               $self->{_content} .= $$chunkref;
161                                          }
162                                          elsif ($ref eq "SCALAR") {
163                                              ${$self->{_content}} .= $$chunkref;
164                                          }
165                                          else {
166                                              Carp::croak("Can't append to $ref content");
167                                          }
168   7585   242ms                           delete $self->{_parts};
169                                      }
170
I can't for the life of me see why a simple ref call on a hash element would take 320s, even over 7585 calls. We are not using LWP::UserAgent::POE, just plain LWP::UserAgent; in fact the former is not even installed. I've not got to the bottom of this yet. Perhaps you can see something I've missed? I've posted this to the Devel::NYTProf list as well, if you are interested in following it.
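One experiment worth trying (my suggestion for isolating it, not something from the profile above) is to mimic what add_content does - thousands of appends of network-sized chunks to a scalar held in a hash - outside of LWP and POE entirely, and profile just that:

#!/usr/bin/perl
use strict;
use warnings;

# Roughly mimic HTTP::Message::add_content: append ~4KB chunks
# to a scalar stored in a hash, the same number of times as above.
my $self  = { _content => '' };
my $chunk = 'x' x 4096;

for (1 .. 7585) {
    my $chunkref = \$chunk;
    my $ref = ref($self->{_content});
    if (!$ref) {
        $self->{_content} .= $$chunkref;
    }
}
print length($self->{_content}), " bytes appended\n";

Run it under perl -d:NYTProf, with whatever command-line switches the daemon normally gets, and see whether the same statements attract a similar amount of time.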

Comments

where the blame currently lies

After a good deal of work trying to narrow this down, it appears to be something to do with taint mode; I'll blog about it separately.
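For anyone chasing a similar problem, a quick way to confirm whether a running script actually has taint checks enabled is the ${^TAINT} special variable:

# ${^TAINT} is 1 under -T, -1 under -t, 0 otherwise (perl 5.8+)
warn "taint mode: ${^TAINT}\n";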