perl.perl5.porters | Postings from August 2013

Re: [perl #119445] performance bug: perl Thread::Queue is 20x slower than Unix pipe

John Heidemann
August 28, 2013 04:35
On Tue, 27 Aug 2013 11:18:57 +0100, Nicholas Clark wrote: 
>On Mon, Aug 26, 2013 at 08:58:14AM -0700, John Heidemann wrote:
>> My concern is that Thread::Queue also *forces* shared data, even when
>> it's not required.  If that sharing comes with a 20x performance hit,
>> that should be clear.
>Yes, I agree that that's a valid concern, and we could document that better.
>As someone rather too close to the code, it's not easy to pull back far
>enough to work out where someone reading the documentation for the first
>time would have expected to have found such a warning.
>Do you have a suggestion for where we should document this, such that you
>would have read it had it been there? (Even better if you can suggest a
>suitable change)

A proposed patch to perlthrtut is attached at the end of this message.

>> Alternatively, I'd love some mechanism to share data between threads
>> that allows a one-time handoff (not repeated sharing) with pipe-like
>> performance.  One would *think* that shared memory should be able to be
>> faster than round-tripping through a pipe (with perl parsing and kernel
>> IO).  It seems like a shame that perl is forcing full-on sharing since
>> it's slow and not required (in this case).
>Agree, I'd love this too. It would permit a lot of effective higher level
>concurrency designs to work*. But sadly I don't believe that Perl 5 will
>ever be able to provide a performant hand-off mechanism. The internals
>assume all over that it's safe for any logical read to actually be a write
>behind the scenes (making it awkward to provide any sort of read-only view
>of another thread's data), and all interpreter data structures are
>implicitly tied to the interpreter that allocated them, which would take a
>massive amount of refactoring to attempt to untangle.
>I don't think that this is particularly a Perl problem. I'm not aware of any
>comparable C-based dynamic language that has managed to retrofit true
>concurrency. CPython still has a GIL (and Unladen Swallow failed to deliver
>on its design to remove that), and my understanding is that Ruby (MRI/YARV)
>still single-threads its interpreter, and PHP doesn't even offer threading.
>If we had a design to steal, we'd steal it. :-/

I don't know anything about C-level internals of perl.

I agree these are inherent in *shared* variables independent of language.

It's too bad there's no way to move data between two threads without
making the data shared beyond the move itself: a one-time copy from
thread A to thread B.  C-only programs have done this for ages (see,
for example, "The Duality of Memory and Communication in the
Implementation of a Multiprocessor Operating System" by Young et al.,
ACM SOSP 1987).

What I'll do for now is get this effect by printing the data to a pipe
and reading it back in through the other end, but boy, that's a lot of
work on the Perl side that could be hidden inside the C, where it would
be both cleaner and hopefully faster.
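
A minimal sketch of that pipe workaround, assuming a perl built with
ithreads and records that are single-line text (anything structured
would first need serializing, for example with Storable).  The name
handoff_via_pipe is hypothetical, used only for illustration:

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: send single-line text records from a producer
# thread to the calling thread through a pipe, so each record is copied
# once and never becomes a shared variable.
sub handoff_via_pipe {
    my @records = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $producer = threads->create(sub {
        # The new thread works on its own clone of @records and of
        # both filehandles.
        print {$writer} "$_\n" for @records;
        close($writer) or die "close failed: $!";
        return scalar @records;
    });

    # Drop the consumer's handle on the write end; otherwise the read
    # loop below would never see end-of-file.
    close($writer);

    my @received;
    while (my $line = <$reader>) {
        chomp $line;
        push @received, $line;
    }
    close($reader);

    $producer->join;
    return @received;
}

my @out = handoff_via_pipe(map { "record $_" } 1 .. 5);
print scalar(@out), " records, last is '$out[-1]'\n";
```

The consumer reads while the producer writes, so the pipe buffer never
has to hold the whole data set; the cost is the formatting and parsing
on the Perl side mentioned above.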


--- perlthrtut.pod-	2013-08-27 08:47:16.347167972 -0700
+++ perlthrtut.pod	2013-08-27 08:53:26.159772710 -0700
@@ -465,6 +465,13 @@
 data inconsistency and race conditions. Note that Perl will protect its
 internals from your race conditions, but it won't protect you from you.
 
+=head2 Thread Pitfalls: Performance
+
+Shared data and locking are expensive, slowing down access.
+As of perl 5.18, one should expect sharing data between threads
+with tools such as L<Thread::Queue> to be about 15-20x slower
+than copying the data through L<pipe(2)>.
+
 =head1 Synchronization and control
 
 Perl provides a number of mechanisms to coordinate the interactions
