I’m traveling but cursory examining the patch looks good Sent from my iPhone > On Jun 10, 2019, at 7:24 AM, James E Keenan via RT <perlbug-followup@perl.org> wrote: > >> On Mon, 10 Jun 2019 11:53:37 GMT, hv wrote: >>> On Thu, 06 Jun 2019 10:30:58 -0700, jkeenan wrote: >>> Watchpoint 2: PL_locale_mutex >>> >>> Old value = 0x801e1c1e0 >>> New value = 0x2 >>> _pthread_mutex_destroy (mutex=0xa45b28) at pthread_md.h:95 >>> 95 return (TCB_GET64(tcb_thread)); >> >> Ok, I can make sense of this bit now: the pthread_mutex_destroy code >> explicitly overwrites the pointer to signal that it has been >> destroyed: >> >> /usr/src/lib/libthr/thread/thr_private.h has: >> #define THR_MUTEX_DESTROYED ((struct pthread_mutex *)2) >> /usr/src/lib/libthr/thread/thr_mutex.c has: >> *mutex = THR_MUTEX_DESTROYED; >> in multiple cases. >> >> So this isn't corruption but intentional. The core of the problem, >> though, is that we're seeing an error returned from the mutex_destroy, >> which is what is triggering our attempt to croak. And the error turns >> out to be EBUSY, indicating that we're trying to destroy a mutex that >> is still locked, and with the debug option -DLv it becomes trivial to >> see that: >> >> $ ./perl -Ilib -DLv -we 'my $x = sprintf("%2E", 0)' >> sv.c: 12972: locking lc_numeric; depth=1 >> sv.c: 13422: unlocking lc_numeric; depth=0 >> >> EXECUTING... >> >> $ ./perl -Ilib -DLv -we 'my $x = sprintf("%7000000000E", 0)' >> sv.c: 12972: locking lc_numeric; depth=1 >> >> EXECUTING... >> >> sv.c: 12972: avoided lc_numeric_lock; new depth=2 >> Numeric format result too large at -e line 1. >> Segmentation fault (core dumped) >> $ >> >> We're invoking STORE_LC_NUMERIC_SET_TO_NEEDED, which is documented >> with: >> On threaded perls not operating with thread-safe functionality, this >> macro uses >> a mutex to force a critical section. Therefore the matching RESTORE >> should be >> close by, and guaranteed to be called. >> .. however this is wrapping precisely the call with the new error >> check, so when that croaks the RESTORE_LC_NUMERIC is missed, and that >> causes us to attempt global destruction with the mutex still held. >> >> Because there are 2 nested calls to the STORE macro, I'm not sure how >> correct the obvious fix is; there also seems to be some cleverness >> with pragmas to try and enforce balancing of STORE/RESTORE. I was able >> to get it both to compile and run the test case without error with the >> attached, but I think it's Karl's baby so I hope he can advise. >> >> Note also: a) there are other opportunities to hit fatal errors in >> sprintf, I assume they'll all need the same handling; b) the >> 'mutex_destroy or croak' pattern makes no sense after >> PerlIO_teardown(), but I'm not sure what we can do instead - the bug >> uncovered here certainly suggests we'd have been unwise simply to >> ignore errors on mutex_destroy. >> >> Hugo > > Thanks. I realize your patch is preliminary, but to get some data from other platforms, I've placed it in this branch: > > smoke-me/jkeenan/hv/134172-hack > > Thank you very much. > > -- > James E Keenan (jkeenan@cpan.org) > > --- > via perlbug: queue: perl5 status: open > https://rt.perl.org/Ticket/Display.html?id=134172Thread Previous | Thread Next