On Tue, Aug 03, 2021 at 04:54:56PM +0100, hv@crypt.org wrote: > :I think we can patch them all. Several are just truth tests - I propose > :a new macro HvIS_EMPTY(hv) to replace them, which we also add to ppport.h > > I agree with leonerd that we should do this one right away. > > :Others just iterate over all the hash buckets - I propose we add > :hv_foreach() to core and ppport.h to handle these cleanly. > > Does that cover all remaining uses of HvARRAY that you found? Ie could > we provide HvIS_EMPTY and hv_foreach and then deprecate HvARRAY? Sigh, tied hashes... So, the initial "implementation" of HvIS_EMPTY() was !HvTOTALKEYS(), which made sense, as it's answering the same "logical" question as !HvARRAY() - is this hash empty? Except that it answers it correctly. But then I realised that "half" the fun with the XS interface to hashes is that almost all of it doesn't hide the implementation details of tied hashes. The only real exception is the existing iteration interface, and also my implementation of hv_foreach(). So, it feels like a design regression to add a new interface which is both "you're on your own with tied hashes" and inconsistent with foreach. Hence, the better implementation is this: PERL_STATIC_INLINE bool Perl_HvIS_EMPTY(const HV *hv) { return !SvRMAGICAL(hv) && !HvTOTALKEYS(hv); } at which point I tested that in the perl core (in place of the previous), and discovered that 1) Devel::Peek::Dump would SEGV on a tied hash - previously untested... https://github.com/Perl/perl5/commit/8937088f79a80f8c 2) Most of the other uses in the core probably weren't exposed to tied hashes, but would crash similarly if one arrived, because they dereference HvARRAY(), "knowing" that it can't be NULL... so ultimately you can't have a sane HvIS_EMPTY() without also migrating the following code from explicit iteration of HvARRAY() to an API that (also) handles tied hashes. So, the answer to the actual question about CPAN - I think that modules partition roughly like this 1) HvIS_EMPTY Sub-Disable 2) just replace HvARRAY iteration with hv_foreach: Class-Accessor-Inherited-XS Data-Recursive Devel-Arena (but build fails on blead) JavaBin 3) HvIS_EMPTY and hv_foreach Devel-Size Internals-DumpArenas namespace-clean-xs Sub-Disable 4) HvIS_EMPTY and hv_foreach, but needs special handling for PL_strtab Devel-MAT-Dumper Devel-SizeMe (but build fails on blead) 5) Just remove the existing code... Cairo Glib (these both "inherit" the code from pp_defined, but are only interested in scalars) perlec (but build fails on blead) autovivification just make it return 0 - it also "borrows" the code from pp_defined. Whilst it does pass that code AVs and HVs, it takes equivalent (and correct) code paths if it doesn't check AvARRAY and HvARRAY. The change would mean static OP *a_pp_root_unop(pTHX) static OP *a_pp_root_binop(pTHX) would instead hit: return oi->old_pp(aTHX); 6) Unclear, and don't pass tests on blead B-C - unclear if anything needs fixing - tests have failed for some time 7) Unclear, and don't build on blead JSPL - wraps obsolete spidermonkey implementation mod_perl - modperl_perl_hv_fetch_he - could be replaced with hv_common? 8) Need code changes Hash-Spy - it's swapping the guts of hashes around. It needs different code. Sereal-Encoder - might technically be possible with hv_foreach, but I'd already prototyped "different code" Storable - ditto threads-tbb - doesn't build (arm32)/some tests abort with coredumps (x86_64) uses interfaces Intel have marked as deprecated... *as written* it would need "new code" for its iterator, but it could *probably* be refactored to the right "shape" for hv_foreach. However, I don't see a future for the module... XS-Framework - Has two uses of HvARRAY. The macro XS_HV_ITER effectively *is* hv_foreach Hash.begin()/end() - I think that these can be directly mapped to the existing HvABH API (Might need to add an "iterator is equal" function to enable comparisons with Hash.end(), but the ABH iterator API was purposefully pretty close to the C++ STL iteration "shape") > I'm nervous that there are enough uses of the various affected structures > on CPAN that it seems likely there'll be yet more in the DarkPAN; I think > we should aim to mitigate as much as possible in advance with a combination > of new interfaces and deprecations. I think that it's 23 distributions out of 2726 on CPAN that have XS code. To me it looks like many of the ones in that list are doing "other" internals things that are sufficiently invasive that they're all at some risk of breaking on any major release. Of all the companies that I've worked for (or seen inside) I think that only two have had custom XS code, and none of it was dealing with data structures more complex than scalars. In both cases they were also competent enough to build their own Perl binary rather than using whatever the OS gave them, and competent enough to fix their own XS code. So 1) I'm really not sure how much *XS* code there is in places we can't see 2) It's XS code. It won't install if it won't compile 3) The changes we need here are going to be compile time failures > Additionally there is a fair amount of discussion in the notes about > randomization to avoid algorithmic complexity attacks, but I see none > about the converse - turning off the randomization to allow reproducible > behaviour, eg for debugging. I think that's also important, and also > merits some commentary. Agree. The branch was the "proof of concept". Can I even make it work? This part would follow logically from cleaning up the existing hash randomisation setup, which I'd not done either. Nicholas ClarkThread Previous | Thread Next