Perl super-easy parallelization with threadeach

Posted by Chris on December 7th, 2008 filed in programming

I’ve been thinking about a good way to make Perl more parallelizable. The thing that keeps coming to mind is that it should be so easy that you wouldn’t even think about it. A lot of the time in sysadmin-land, you have to do exactly the same thing to a whole bunch of things. Some examples from just the last week at work:

For each thing in a list, connect to its database, get x data, do some analysis on that.

For each server in a list, connect to it and do something. Push a file, get a file, run a command, etc.

For each ip/port in a list, open a socket and listen for x time, then return the results.

So, what to call it? I just like the name threadeach(). Normally, in Perl, you do this:

foreach my $thing (@list_of_things) { ... do something }

It’d be nice if you knew this was easily done in parallel, to do it like this:

threadeach my $thing(@list_of_things){function to be performed on each thing}

Right now, you can *sort of* do the same thing with a little work. I whipped up a threadeach module that I’ve been using, with three functions in it.

  • threadeach(\&subroutine,@array) # parallelizes, running (number of CPU cores) threads at a time
  • threadall(\&subroutine,@array) # parallelizes everything at once!!!  Kind of crazy, but fun actually
  • threadsome(\&subroutine,<num to run>,@array) # parallelizes, running the given number of threads at a time

It’s got one more trick: it waits for all the threads to finish and then gives back their return values in the same order as the input.  A lot of the time I do a foreach and print something inside it; here I can just return what I’d have printed and print it all at the end:  print threadeach(\&sub,@things);
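To show the "return values in order" idea on its own, here is a minimal sketch using the core threads module. It is not the module's actual code: threadall_sketch is a made-up name, and it assumes a perl built with thread support. It starts one thread per item and joins them in launch order, so the results line up with the input list.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for threadall(): one thread per item,
# results handed back in the same order as the input list.
sub threadall_sketch {
    my ($code, @items) = @_;

    # Start every thread first. Scalar context means each thread
    # gives back exactly one value when it is joined.
    my @workers = map {
        threads->create({ context => 'scalar' }, $code, $_)
    } @items;

    # Join in launch order, so slot N of the result matches item N.
    my @results;
    for my $thr (@workers) {
        my $value = $thr->join;
        push @results, $value;
    }
    return @results;
}

# Instead of printing inside the loop, return the text and print it at the end.
print threadall_sketch(
    sub { my ($n) = @_; return "thing $n is done\n" },
    1 .. 4,
);
```

Joining in launch order is the simplest way to keep results lined up, but it only works when everything is started at once. Limiting the number of threads needs the bookkeeping described under "How it works".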

I’ve been using it in my check_network script, which looks at everything in a given subnet, and it works pretty well.  I just had to change the line from foreach my $ip(0..255){…} to threadeach(\&…,0..255);sub {…} and, instead of printing inside the loop, return the text I would have printed (as described above).  It’s been working really well in this limited case. I still have to try it on other things, but I don’t see why it wouldn’t work fine.  One caveat: since this script runs nmap against each host, it uses a good bit of CPU. I tried threadall() and it almost hung the machine; 255 nmap processes at once will do that.


Running original version: sudo time perl -> 524.71 real 76.91 user 26.39 sys

Running threadeach version: sudo time perl -> 189.95 real 77.31 user 28.48 sys

Roughly 1/3 the wall-clock time, with user and sys time about the same.  That makes sense: in the original version, however long each machine takes gets added to the total, while in the parallel version a slow one can be “worked around”.  In my case, for some reason the .107 box takes several minutes to run (I skipped it in this test), but even not counting that one, some hosts are almost instant (the down boxes) and some take longer.

How it works

Apart from deciding how many threads to run at a time, it’s fairly straightforward.  That number depends on how it’s called: by default it tries to get the number of CPUs on the machine and, if it can’t, falls back to an arbitrary number (currently 8).  The setup is an empty hash to map thread ID to index, an index variable to keep track of where we are in the list, and an empty array to store anything being returned.
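The thread-count decision could look something like this. It is only a sketch: default_thread_count is a hypothetical name, and it assumes Sys::CPU (a CPAN module, not part of core Perl) may or may not be installed. If loading it or asking it fails, the arbitrary fallback of 8 is used.

```perl
use strict;
use warnings;

# Hypothetical helper: how many threads to run at once.
sub default_thread_count {
    my $fallback = 8;    # the arbitrary default mentioned above

    # Sys::CPU is optional; eval catches the failure if it is missing.
    my $count = eval {
        require Sys::CPU;
        Sys::CPU::cpu_count();
    };

    # Only trust a positive whole number; otherwise use the fallback.
    return $fallback unless defined $count && $count =~ /^\d+$/ && $count > 0;
    return $count;
}

print "would run ", default_thread_count(), " threads at a time\n";
```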

Main loop:  As long as there are:

  • Threads working
  • Threads done and waiting to return
  • or more things to do

each pass through the loop:

  • Gets the return values of any threads waiting to give them back (each return value goes into the slot of the return array that matches the slot of the item the thread was originally passed)
  • Launches more threads, until there are either no more items left or it has reached the max number
  • When launching those threads, puts the index (the slot in the original array) into a hash as the value, with the thread ID as the key
  • Sleeps for one second

And at the end, it returns the @return array.  The @return isn’t strictly necessary if it’s supposed to be a foreach replacement, but it works really well where it’s useful.  The sleep(1); isn’t strictly necessary either, but (1) if you’re running a bunch of threads, waiting a second at a time isn’t a huge deal, and (2) without it, the tight while loop pegs the CPU just checking on thread status.
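Put together, the loop described above might be sketched like this. It is a reconstruction from the description, not the real module code: threadsome_sketch and its variable names are made up, and it assumes a perl built with thread support. One small difference is that it tracks "threads working or waiting" through the hash itself rather than asking for every thread in the process, so it ignores threads it didn’t start.

```perl
use strict;
use warnings;
use threads;

# Hypothetical reconstruction of threadsome(): run at most $max threads
# at a time and return the results in the order of the input list.
sub threadsome_sketch {
    my ($code, $max, @items) = @_;
    $max = 1 if $max < 1;    # guard against a loop that never launches

    my %index_for;    # thread ID => slot in @items
    my $next = 0;     # next slot to hand out
    my @return;       # results, stored by slot

    # Entries stay in %index_for until joined, so the hash covers both
    # "threads working" and "threads done and waiting to return".
    while (%index_for || $next < @items) {

        # Collect results from any of our threads that have finished.
        for my $thr (threads->list(threads::joinable)) {
            my $tid = $thr->tid;
            next unless exists $index_for{$tid};    # not one of ours
            my $slot = delete $index_for{$tid};
            $return[$slot] = $thr->join;
        }

        # Launch more threads until we run out of items or hit the limit.
        while ($next < @items && scalar(keys %index_for) < $max) {
            my $thr = threads->create({ context => 'scalar' }, $code, $items[$next]);
            $index_for{ $thr->tid } = $next;
            $next++;
        }

        # Avoid a tight polling loop while threads are still out.
        sleep 1 if %index_for;
    }
    return @return;
}

my @out = threadsome_sketch(sub { my ($n) = @_; return "host $n checked" }, 2, 1 .. 4);
print "$_\n" for @out;
```

A threadeach-style wrapper would then just call this with the CPU count as the limit, and a threadall-style one with the length of the list.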

In the future…

Figure out how to make it work as a drop-in replacement for foreach. Calling it as a function seems so hack-ish.  Find a better way to decide how many threads to run (if Sys::CPU doesn’t work).  I was thinking about using the current year minus 2005, so the number would increase over time.  Or I could just require Sys::CPU…  It could also check that whatever is in the main block is thread-safe, but it should probably just trust the user.  In the meantime, I’m going to use it for a little bit and bang against it on a few systems before throwing it out into the cruel, cold world.

It may also be cool if it can buffer I/O so that for some things it really does act just like foreach, too.
