develooper Front page | perl.perl5.porters | Postings from June 2019

[perl #134172] commit 027471cf breaks t/op/sprintf2.t on FreeBSD-11

Thread Previous | Thread Next
From:
Hugo van der Sanden via RT
Date:
June 10, 2019 11:53
Subject:
[perl #134172] commit 027471cf breaks t/op/sprintf2.t on FreeBSD-11
Message ID:
rt-4.0.24-13913-1560167617-1481.134172-15-0@perl.org
On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote:
> Watchpoint 2: PL_locale_mutex
> 
> Old value = 0x801e1c1e0
> New value = 0x2
> _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95
> 95              return (TCB_GET64(tcb_thread));

Ok, I can make sense of this bit now: the pthread_mutex_destroy code explicitly overwrites the pointer to signal that it has been destroyed:

/usr/src/lib/libthr/thread/thr_private.h has:
#define        THR_MUTEX_DESTROYED    ((struct pthread_mutex *)2)
/usr/src/lib/libthr/thread/thr_mutex.c has:
  *mutex = THR_MUTEX_DESTROYED;
in multiple cases.

So this isn't corruption but intentional. The core of the problem, though, is that we're seeing an error returned from the mutex_destroy, which is what is triggering our attempt to croak. And the error turns out to be EBUSY, indicating that we're trying to destroy a mutex that is still locked, and with the debug option -DLv it becomes trivial to see that:

$ ./perl -Ilib -DLv -we 'my $x = sprintf("%2E", 0)'
sv.c: 12972: locking lc_numeric; depth=1
sv.c: 13422: unlocking lc_numeric; depth=0

EXECUTING...

$ ./perl -Ilib -DLv -we 'my $x = sprintf("%7000000000E", 0)' 
sv.c: 12972: locking lc_numeric; depth=1

EXECUTING...

sv.c: 12972: avoided lc_numeric_lock; new depth=2
Numeric format result too large at -e line 1.
Segmentation fault (core dumped)
$ 

We're invoking STORE_LC_NUMERIC_SET_TO_NEEDED, which is documented with:
  On threaded perls not operating with thread-safe functionality, this macro uses
  a mutex to force a critical section.  Therefore the matching RESTORE should be
  close by, and guaranteed to be called.
.. however this is wrapping precisely the call with the new error check, so when that croaks the RESTORE_LC_NUMERIC is missed, and that causes us to attempt global destruction with the mutex still held.

Because there are 2 nested calls to the STORE macro, I'm not sure how correct the obvious fix is; there also seems to be some cleverness with pragmas to try and enforce balancing of STORE/RESTORE. I was able to get it both to compile and run the test case without error with the attached, but I think it's Karl's baby so I hope he can advise.

Note also: a) there are other opportunities to hit fatal errors in sprintf, I assume they'll all need the same handling; b) the 'mutex_destroy or croak' pattern makes no sense after PerlIO_teardown(), but I'm not sure what we can do instead - the bug uncovered here certainly suggests we'd have been unwise simply to ignore errors on mutex_destroy.

Hugo

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=134172

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About