{"id":24,"date":"2008-12-07T21:04:08","date_gmt":"2008-12-08T04:04:08","guid":{"rendered":"http:\/\/www.imaginarybillboards.com\/?p=24"},"modified":"2010-04-05T10:45:13","modified_gmt":"2010-04-05T17:45:13","slug":"perl-super-easy-parallelization-with-threadeach","status":"publish","type":"post","link":"http:\/\/www.imaginarybillboards.com\/?p=24","title":{"rendered":"Perl super-easy parallelization with threadeach"},"content":{"rendered":"

I've been thinking about a good way to make Perl easier to parallelize. The thing that keeps coming to mind is that it should be so easy you wouldn't even have to think about it. A lot of the time in sysadmin-land, you just have to do the exact same thing to a whole bunch of things. Some examples from just the last week at work:

For each thing in a list, connect to its database, pull some data, and do some analysis on it.

For each server in a list, connect to it and do something: push a file, get a file, run a command, etc.

For each IP/port in a list, open a socket, listen for some amount of time, then return the results.

So, what to call it? I just like the name threadeach(). Normally, in Perl, you'd write this:

    foreach my $thing (@list_of_things) { ... do something ... }

It'd be nice, when you know the loop body can safely run in parallel, to be able to write it like this instead:

    threadeach my $thing (@list_of_things) { function to be performed on each $thing }
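
For comparison, here is roughly what the same loop looks like with no helper at all, just Perl's core threads module: spawn a thread per item, then join them all. This is only a sketch, and do_something() stands in for whatever the loop body would actually be.

    use threads;

    my @workers;
    foreach my $thing (@list_of_things) {
        # one thread per item; the bodies all run in parallel
        push @workers, threads->create( sub { do_something($thing) } );
    }

    # wait for every thread to finish
    $_->join for @workers;

It works, but that's a fair amount of boilerplate for what is conceptually just "foreach, but in parallel."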

Right now, you can *sort of* do the same thing with a little work. I've got a threadeach module along those lines that I've been using.

Threadeach

I whipped up a module for it with three functions.
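
Just to make the shape concrete, here's a minimal, single-function sketch of the core trick (a sketch only, not the actual three-function module): a (&@) prototype, the same one List::Util uses for first and reduce, lets the block come first in the call, and the rest is just thread create/join.

    package Threadeach;

    use strict;
    use warnings;
    use threads;

    use Exporter 'import';
    our @EXPORT = qw(threadeach);

    # Usage:  my @results = threadeach { ... $_ ... } @list;
    # Runs the block once per element, each in its own thread, with $_
    # set to the current element inside the block, then joins them all
    # and returns whatever the blocks returned.
    sub threadeach (&@) {
        my ( $code, @items ) = @_;
        my @workers = map {
            my $item = $_;
            threads->create( sub { local $_ = $item; $code->($item) } );
        } @items;
        return map { $_->join } @workers;
    }

    1;

With that, the call reads almost like the dream syntax: threadeach { ... } @list_of_things; where the block body is whatever you'd have put inside the foreach. One caveat: Perl ithreads are fairly heavyweight, so one thread per element is fine for a few dozen servers, but for really long lists you'd want some kind of worker pool instead.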