Pairwise list reduction the hard way

Have you ever needed to process a list in pairs using Perl? The typical way to do this involves a loop with each() (for a hash) or splice() (for an array). This is my account of trying to find a more functional way to do it.

I’ll use a simple example of what I mean. Given a list of field and value pairs, the goal is to construct a query string suitable for an HTTP GET or POST request. Leaving aside that there are CPAN modules that do this for us and ignoring URL encoding, here is one way this might look:

# Example 1
use 5.010; use strict; use warnings;

my @data = (
  action => 'submit',
  type => 'comment',
  message => 'Hello'
);

my @output;
while ( my ($k,$v) = splice(@data,0,2) ) {
    push @output, "$k=$v"
}
say join( "&", @output );

That produces the output: action=submit&type=comment&message=Hello

Looking at the code, note the extra lexical @output to store the intermediate results. Also, note that the splice() is destructive and empties @data. If I want to preserve @data, I have to copy it first:

# Example 2
use 5.010; use strict; use warnings;

my @data = (
  action => 'submit',
  type => 'comment',
  message => 'Hello'
);

my @output;
my @copy = @data;
while ( my ($k,$v) = splice(@copy,0,2) ) {
    push @output, "$k=$v"
}
say join( "&", @output );

If I want to do the same thing with a hash of data, it would look like this:

# Example 3
use 5.010; use strict; use warnings;

my %data = (
  action => 'submit',
  type => 'comment',
  message => 'Hello'
);

my @output;
while ( my ($k,$v) = each %data ) {
    push @output, "$k=$v"
}
say join( "&", @output );

Is there a more functional way to do this that is non-destructive, has no intermediate variables, takes either a hash or array as just a list of input and can be used as part of a map chain? Sure! Here’s a way to do it with reduce from List::Util:

# Example 4
use 5.010; use strict; use warnings;
use List::Util 'reduce';

my @data = (
  action => 'submit',
  type => 'comment',
  message => 'Hello'
);

say join "&", map { "$_->[0]=$_->[1]" } map { @$_ } reduce { 
    ! @$a || ref $a->[-1] ? push @$a, $b : push @$a, [pop @$a, $b]; $a
} [], @data;

That doesn’t win any clarity awards, but it works. The reduce block is first called with the first two values of the list as $a and $b. Subsequent calls have the output of the last execution of the block in $a and the next value of the list in $b.

By passing in an empty array ref as the first element, I give it a place to accumulate pairs of data. On each call, I either stash the field name at the end of $a or pop off the field name and replace it with a new arrayref with the field name and value. I end the reduce call with $a so that becomes $a for the next iteration.

It builds up like this:

N  $a                                 $b
1  []                                 'action'      
2  ['action']                         'submit'
3  [['action', 'submit']]             'type'    
4  [['action', 'submit'], 'type']     'comment'
...

The output of reduce is a single arrayref of arrayrefs of pairs of data. The map { @$_ } call dereferences that back into a list of pairs (now wrapped in arrayrefs) that are turned into “field=value” strings and passed to the join.

Unfortunately, this approach creates an array ref for every pair and then throws them away. That’s not a great idea, so I thought I’d see if I could create the “field=value” pairs directly as strings during the reduce. By adding another layer to the container that holds the output, I can do this:

# Example 5
use 5.010; use strict; use warnings;
use List::Util 'reduce';

my @data = (
  action => 'submit',
  type => 'comment',
  message => 'Hello'
);

say join "&", map { @{$_->[0]} } reduce {
    ref $a->[-1] ? push @$a, $b : push @{$a->[0]}, pop(@$a) . "=$b"; $a
} [[]], @data;

That is almost exactly the same as the last one, except that the container starts with an extra arrayref to store the list of strings being generated and the dereferencing map is a little uglier.

How do these different examples benchmark? Ignoring example 1, which is destructive, here are the results of a quick benchmarking:

Rate example5 example4 example2 example3
example5 404781/s       --     -30%     -57%     -59%
example4 580886/s      44%       --     -38%     -41%
example2 936228/s     131%      61%       --      -5%
example3 985939/s     144%      70%       5%       --

The two while loop approaches are much faster! This shouldn’t be surprising given all the assignments and array manipulations happening in the reduce approaches. And the while loops are much easier to read. It turns out the typical approach is still the best.

What I’d really like to see someday is a kvmap (“key-value map”) function that would iterate through pairs from a list, pass them as $a and $b to a block and return a list of the results. Then I could write this:

# Example 6
use 5.010; use strict; use warnings;
use Doesnt::Exist::Yet 'kvmap';

my @data = (
  action => 'submit',
  type => 'comment',
  message => 'Hello'
);

say join "&", kvmap { "$a=$b" } @data;

That’s much shorter and simpler than a while loop and, if implemented in XS, might be comparably fast (or faster).

Does any enterprising soul want to try adding that to List::MoreUtils? I think it would find a lot of use.

(c.f. List::Pairwise, but it’s not implemented in XS)