develooper Front page | perl.perl5.porters | Postings from March 2013

Why HvFILL (was Re: [perl.git] branch smoke-me/nicholas/lazy-hv-fill,updated. v5.17.9-167-g040e18a)

Thread Next
From:
demerphq
Date:
March 19, 2013 11:22
Subject:
Why HvFILL (was Re: [perl.git] branch smoke-me/nicholas/lazy-hv-fill,updated. v5.17.9-167-g040e18a)
Message ID:
CANgJU+VBhn071Z_Pse4CGXTVLB6_Z3akedE+HkOFSA_AYL=WKA@mail.gmail.com
Repositing as I have a feeling it got warnocked due to mail filters....


Yves


On 14 March 2013 15:17, demerphq <demerphq@gmail.com> wrote:
> Why do we care about HvFILL?
>
> We already are providing better info that HvFILL can via Hash::Util so
> why bother supporting HvFILL at all?
>
> As far as I know we dont actually use HvFILL in core. ..... And I just
> checked again, the only place we use HvFILL in the core is on:
>
> my %new_hv= %old_hv;
>
> and we do it as an optimization to see if we can reduce the number of
> buckets. This doesn't really make sense, the proper thing IMO to use
> for such an optimization is HvTOTALKEYS instead.
>
> As such IMO the most intelligent thing we can do about HvFILL is
> remove it outright.
>
> Yves
>
> On 14 March 2013 14:55, Nicholas Clark <nick@ccl4.org> wrote:
>> In perl.git, the branch smoke-me/nicholas/lazy-hv-fill has been updated
>>
>> <http://perl5.git.perl.org/perl.git/commitdiff/040e18a11c041f7f3624df77ad51b8dc6a64facf?hp=844cbf57107cbabdbb4e4f0ce403d0fde31b9a66>
>>
>> - Log -----------------------------------------------------------------
>> commit 040e18a11c041f7f3624df77ad51b8dc6a64facf
>> Author: Nicholas Clark <nick@ccl4.org>
>> Date:   Mon Mar 11 11:42:32 2013 +0000
>>
>>     Cache HvFILL() for larger hashes, and update on insertion/deletion.
>>
>>     This avoids HvFILL() being O(n) for large n on large hashes, but also avoids
>>     storing the value of HvFILL() in smaller hashes (ie a memory overhead on
>>     every single object built using a hash.)
>>
>> M       dump.c
>> M       embed.fnc
>> M       ext/Devel-Peek/t/Peek.t
>> M       hv.c
>> M       hv.h
>> M       proto.h
>> M       sv.c
>>
>> commit 71dc3f31560ee9e28c9280c8304074c8c07f3b90
>> Author: Nicholas Clark <nick@ccl4.org>
>> Date:   Mon Mar 11 11:18:11 2013 +0000
>>
>>     Perl_hv_fill() can return early if the hash only has 0 or 1 keys.
>>
>>     No keys implies no chains used, so the return value is 0. One key
>>     unambiguously means 1 chain used, and all the others are free. Two or more
>>     keys might share the same chain, or might not, so the calculation can't be
>>     short-circuited.
>>
>> M       hv.c
>>
>> commit 6179f3c98658ee03c9d9778166f58f45cf5684f4
>> Author: Nicholas Clark <nick@ccl4.org>
>> Date:   Mon Mar 11 10:32:12 2013 +0000
>>
>>     Tests for hashes in scalar context (and hence HvFILL()).
>>
>> M       t/op/hash.t
>> -----------------------------------------------------------------------
>>
>> Summary of changes:
>>  dump.c                  |   25 ++++++++++-
>>  embed.fnc               |    2 +-
>>  ext/Devel-Peek/t/Peek.t |  117 +++++++++++++++++++++++++++++++++++++++++++++-
>>  hv.c                    |   86 +++++++++++++++++++++++++++-------
>>  hv.h                    |    3 +-
>>  proto.h                 |    2 +-
>>  sv.c                    |    1 +
>>  t/op/hash.t             |   90 +++++++++++++++++++++++++++++++++++-
>>  8 files changed, 300 insertions(+), 26 deletions(-)
>>
>> diff --git a/dump.c b/dump.c
>> index fcc63fc..13f7f7e 100644
>> --- a/dump.c
>> +++ b/dump.c
>> @@ -1799,7 +1799,30 @@ Perl_do_sv_dump(pTHX_ I32 level, PerlIO *file, SV *sv, I32 nest, I32 maxnest, bo
>>         }
>>         PerlIO_putc(file, '\n');
>>         Perl_dump_indent(aTHX_ level, file, "  KEYS = %"IVdf"\n", (IV)HvUSEDKEYS(sv));
>> -       Perl_dump_indent(aTHX_ level, file, "  FILL = %"IVdf"\n", (IV)HvFILL(sv));
>> +        {
>> +            STRLEN count = 0;
>> +            HE **ents = HvARRAY(sv);
>> +
>> +            if (ents) {
>> +                HE *const *const last = ents + HvMAX(sv);
>> +                count = last + 1 - ents;
>> +
>> +                do {
>> +                    if (!*ents)
>> +                        --count;
>> +                } while (++ents <= last);
>> +            }
>> +
>> +            if (SvOOK(sv)) {
>> +                struct xpvhv_aux *const aux = HvAUX(sv);
>> +                Perl_dump_indent(aTHX_ level, file, "  FILL = %"UVuf
>> +                                 " (cached = %"UVuf")\n",
>> +                                 (UV)count, aux->xhv_fill_lazy);
>> +            } else {
>> +                Perl_dump_indent(aTHX_ level, file, "  FILL = %"UVuf"\n",
>> +                                 (UV)count);
>> +            }
>> +        }
>>         Perl_dump_indent(aTHX_ level, file, "  MAX = %"IVdf"\n", (IV)HvMAX(sv));
>>         Perl_dump_indent(aTHX_ level, file, "  RITER = %"IVdf"\n", (IV)HvRITER_get(sv));
>>         Perl_dump_indent(aTHX_ level, file, "  EITER = 0x%"UVxf"\n", PTR2UV(HvEITER_get(sv)));
>> diff --git a/embed.fnc b/embed.fnc
>> index 2f5e089..e6972d0 100644
>> --- a/embed.fnc
>> +++ b/embed.fnc
>> @@ -529,7 +529,7 @@ Ap  |void*  |hv_common      |NULLOK HV *hv|NULLOK SV *keysv \
>>  Ap     |void*  |hv_common_key_len|NULLOK HV *hv|NN const char *key \
>>                                 |I32 klen_i32|const int action|NULLOK SV *val \
>>                                 |const U32 hash
>> -Apod   |STRLEN |hv_fill        |NN HV const *const hv
>> +Apod   |STRLEN |hv_fill        |NN HV *const hv
>>  Ap     |void   |hv_free_ent    |NN HV *hv|NULLOK HE *entry
>>  Apd    |I32    |hv_iterinit    |NN HV *hv
>>  ApdR   |char*  |hv_iterkey     |NN HE* entry|NN I32* retlen
>> diff --git a/ext/Devel-Peek/t/Peek.t b/ext/Devel-Peek/t/Peek.t
>> index 9a0e37c..5bc4cea 100644
>> --- a/ext/Devel-Peek/t/Peek.t
>> +++ b/ext/Devel-Peek/t/Peek.t
>> @@ -745,7 +745,7 @@ do_test('ENAME on a stash',
>>      NV = $FLOAT                                        # $] < 5.009
>>      ARRAY = $ADDR
>>      KEYS = 0
>> -    FILL = 0
>> +    FILL = 0 \(cached = 0\)
>>      MAX = 7
>>      RITER = -1
>>      EITER = 0x0
>> @@ -768,7 +768,7 @@ do_test('ENAMEs on a stash',
>>      NV = $FLOAT                                        # $] < 5.009
>>      ARRAY = $ADDR
>>      KEYS = 0
>> -    FILL = 0
>> +    FILL = 0 \(cached = 0\)
>>      MAX = 7
>>      RITER = -1
>>      EITER = 0x0
>> @@ -793,7 +793,7 @@ do_test('ENAMEs on a stash with no NAME',
>>      NV = $FLOAT                                        # $] < 5.009
>>      ARRAY = $ADDR
>>      KEYS = 0
>> -    FILL = 0
>> +    FILL = 0 \(cached = 0\)
>>      MAX = 7
>>      RITER = -1
>>      EITER = 0x0
>> @@ -801,6 +801,117 @@ do_test('ENAMEs on a stash with no NAME',
>>      ENAME = "RWOM", "KLANK"                    # $] > 5.012
>>  ');
>>
>> +my %small = ("Perl", "Rules", "Beer", "Foamy");
>> +my $b = %small;
>> +do_test('small hash',
>> +        \%small,
>> +'SV = $RV\\($ADDR\\) at $ADDR
>> +  REFCNT = 1
>> +  FLAGS = \\(ROK\\)
>> +  RV = $ADDR
>> +  SV = PVHV\\($ADDR\\) at $ADDR
>> +    REFCNT = 2
>> +    FLAGS = \\(PADMY,SHAREKEYS\\)
>> +    IV = 1                                     # $] < 5.009
>> +    NV = $FLOAT                                        # $] < 5.009
>> +    ARRAY = $ADDR  \\(0:[67],.*\\)
>> +    hash quality = [0-9.]+%
>> +    KEYS = 2
>> +    FILL = [12]
>> +    MAX = 7
>> +    RITER = -1
>> +    EITER = 0x0
>> +(?:    Elt "(?:Perl|Beer)" HASH = $ADDR
>> +    SV = PV\\($ADDR\\) at $ADDR
>> +      REFCNT = 1
>> +      FLAGS = \\(POK,pPOK\\)
>> +      PV = $ADDR "(?:Rules|Foamy)"\\\0
>> +      CUR = \d+
>> +      LEN = \d+
>> +){2}');
>> +
>> +$b = keys %small;
>> +
>> +do_test('small hash after keys',
>> +        \%small,
>> +'SV = $RV\\($ADDR\\) at $ADDR
>> +  REFCNT = 1
>> +  FLAGS = \\(ROK\\)
>> +  RV = $ADDR
>> +  SV = PVHV\\($ADDR\\) at $ADDR
>> +    REFCNT = 2
>> +    FLAGS = \\(PADMY,OOK,SHAREKEYS\\)
>> +    IV = 1                                     # $] < 5.009
>> +    NV = $FLOAT                                        # $] < 5.009
>> +    ARRAY = $ADDR  \\(0:[67],.*\\)
>> +    hash quality = [0-9.]+%
>> +    KEYS = 2
>> +    FILL = [12] \\(cached = 0\\)
>> +    MAX = 7
>> +    RITER = -1
>> +    EITER = 0x0
>> +(?:    Elt "(?:Perl|Beer)" HASH = $ADDR
>> +    SV = PV\\($ADDR\\) at $ADDR
>> +      REFCNT = 1
>> +      FLAGS = \\(POK,pPOK\\)
>> +      PV = $ADDR "(?:Rules|Foamy)"\\\0
>> +      CUR = \d+
>> +      LEN = \d+
>> +){2}');
>> +
>> +$b = %small;
>> +
>> +do_test('small hash after keys and scalar',
>> +        \%small,
>> +'SV = $RV\\($ADDR\\) at $ADDR
>> +  REFCNT = 1
>> +  FLAGS = \\(ROK\\)
>> +  RV = $ADDR
>> +  SV = PVHV\\($ADDR\\) at $ADDR
>> +    REFCNT = 2
>> +    FLAGS = \\(PADMY,OOK,SHAREKEYS\\)
>> +    IV = 1                                     # $] < 5.009
>> +    NV = $FLOAT                                        # $] < 5.009
>> +    ARRAY = $ADDR  \\(0:[67],.*\\)
>> +    hash quality = [0-9.]+%
>> +    KEYS = 2
>> +    FILL = ([12]) \\(cached = \1\\)
>> +    MAX = 7
>> +    RITER = -1
>> +    EITER = 0x0
>> +(?:    Elt "(?:Perl|Beer)" HASH = $ADDR
>> +    SV = PV\\($ADDR\\) at $ADDR
>> +      REFCNT = 1
>> +      FLAGS = \\(POK,pPOK\\)
>> +      PV = $ADDR "(?:Rules|Foamy)"\\\0
>> +      CUR = \d+
>> +      LEN = \d+
>> +){2}');
>> +
>> +# This should immediately start with the FILL cached correctly.
>> +my %large = (0..1999);
>> +$b = %large;
>> +do_test('large hash',
>> +        \%large,
>> +'SV = $RV\\($ADDR\\) at $ADDR
>> +  REFCNT = 1
>> +  FLAGS = \\(ROK\\)
>> +  RV = $ADDR
>> +  SV = PVHV\\($ADDR\\) at $ADDR
>> +    REFCNT = 2
>> +    FLAGS = \\(PADMY,OOK,SHAREKEYS\\)
>> +    IV = 1                                     # $] < 5.009
>> +    NV = $FLOAT                                        # $] < 5.009
>> +    ARRAY = $ADDR  \\(0:\d+,.*\\)
>> +    hash quality = \d+\\.\d+%
>> +    KEYS = 1000
>> +    FILL = (\d+) \\(cached = \1\\)
>> +    MAX = 1023
>> +    RITER = -1
>> +    EITER = 0x0
>> +    Elt .*
>> +');
>> +
>>  SKIP: {
>>      skip "Not built with usemymalloc", 1
>>        unless $Config{usemymalloc} eq 'y';
>> diff --git a/hv.c b/hv.c
>> index 809bd00..a60a922 100644
>> --- a/hv.c
>> +++ b/hv.c
>> @@ -36,6 +36,7 @@ holds the key and hash value.
>>  #include "perl.h"
>>
>>  #define DO_HSPLIT(xhv) ((xhv)->xhv_keys > (xhv)->xhv_max) /* HvTOTALKEYS(hv) > HvMAX(hv) */
>> +#define HV_FILL_THRESHOLD 31
>>
>>  static const char S_strtab_error[]
>>      = "Cannot modify shared string table in hv_%s";
>> @@ -795,6 +796,12 @@ Perl_hv_common(pTHX_ HV *hv, SV *keysv, const char *key, STRLEN klen,
>>         HvHASKFLAGS_on(hv);
>>
>>      xhv->xhv_keys++; /* HvTOTALKEYS(hv)++ */
>> +    if (!HeNEXT(entry) && SvOOK(hv)) {
>> +        /* initial entry, and aux struct present.  */
>> +        struct xpvhv_aux *const aux = HvAUX(hv);
>> +        if (aux->xhv_fill_lazy)
>> +            ++aux->xhv_fill_lazy;
>> +    }
>>      if ( DO_HSPLIT(xhv) ) {
>>          const STRLEN oldsize = xhv->xhv_max + 1;
>>          const U32 items = (U32)HvPLACEHOLDERS_get(hv);
>> @@ -905,6 +912,7 @@ S_hv_delete_common(pTHX_ HV *hv, SV *keysv, const char *key, STRLEN klen,
>>      XPVHV* xhv;
>>      HE *entry;
>>      HE **oentry;
>> +    HE *const *first_entry;
>>      bool is_utf8 = (k_flags & HVhek_UTF8) ? TRUE : FALSE;
>>      int masked_flags;
>>
>> @@ -980,7 +988,7 @@ S_hv_delete_common(pTHX_ HV *hv, SV *keysv, const char *key, STRLEN klen,
>>
>>      masked_flags = (k_flags & HVhek_MASK);
>>
>> -    oentry = &(HvARRAY(hv))[hash & (I32) HvMAX(hv)];
>> +    first_entry = oentry = &(HvARRAY(hv))[hash & (I32) HvMAX(hv)];
>>      entry = *oentry;
>>      for (; entry; oentry = &HeNEXT(entry), entry = *oentry) {
>>         SV *sv;
>> @@ -1068,6 +1076,12 @@ S_hv_delete_common(pTHX_ HV *hv, SV *keysv, const char *key, STRLEN klen,
>>             HvPLACEHOLDERS(hv)++;
>>         else {
>>             *oentry = HeNEXT(entry);
>> +            if(!*first_entry && SvOOK(hv)) {
>> +                /* removed last entry, and aux struct present.  */
>> +                struct xpvhv_aux *const aux = HvAUX(hv);
>> +                if (aux->xhv_fill_lazy)
>> +                    --aux->xhv_fill_lazy;
>> +            }
>>             if (SvOOK(hv) && entry == HvAUX(hv)->xhv_eiter /* HvEITER(hv) */)
>>                 HvLAZYDEL_on(hv);
>>             else {
>> @@ -1124,7 +1138,13 @@ S_hsplit(pTHX_ HV *hv, STRLEN const oldsize, STRLEN newsize)
>>        return;
>>      }
>>      if (SvOOK(hv)) {
>> +        struct xpvhv_aux *const aux
>> +            = (struct xpvhv_aux *)&a[newsize * sizeof(HE*)];
>>         Move(&a[oldsize * sizeof(HE*)], &a[newsize * sizeof(HE*)], 1, struct xpvhv_aux);
>> +        /* For now, just reset the lazy fill counter.
>> +           It would be possible to update the counter in the code below
>> +           instead.  */
>> +        aux->xhv_fill_lazy = 0;
>>      }
>>
>>      PL_nomemok = FALSE;
>> @@ -1561,19 +1581,25 @@ Perl_hfree_next_entry(pTHX_ HV *hv, STRLEN *indexp)
>>
>>      PERL_ARGS_ASSERT_HFREE_NEXT_ENTRY;
>>
>> -    if (SvOOK(hv) && ((iter = HvAUX(hv)))
>> -       && ((entry = iter->xhv_eiter)) )
>> -    {
>> -       /* the iterator may get resurrected after each
>> -        * destructor call, so check each time */
>> -       if (entry && HvLAZYDEL(hv)) {   /* was deleted earlier? */
>> -           HvLAZYDEL_off(hv);
>> -           hv_free_ent(hv, entry);
>> -           /* warning: at this point HvARRAY may have been
>> -            * re-allocated, HvMAX changed etc */
>> -       }
>> -       iter->xhv_riter = -1;   /* HvRITER(hv) = -1 */
>> -       iter->xhv_eiter = NULL; /* HvEITER(hv) = NULL */
>> +    if (SvOOK(hv) && ((iter = HvAUX(hv)))) {
>> +        if ((entry = iter->xhv_eiter)) {
>> +            /* the iterator may get resurrected after each
>> +             * destructor call, so check each time */
>> +            if (entry && HvLAZYDEL(hv)) {       /* was deleted earlier? */
>> +                HvLAZYDEL_off(hv);
>> +                hv_free_ent(hv, entry);
>> +                /* warning: at this point HvARRAY may have been
>> +                 * re-allocated, HvMAX changed etc */
>> +            }
>> +            iter->xhv_riter = -1;       /* HvRITER(hv) = -1 */
>> +            iter->xhv_eiter = NULL;     /* HvEITER(hv) = NULL */
>> +        }
>> +        /* Reset any cached HvFILL() to "unknown".  It's unlikely that anyone
>> +           will actually call HvFILL() on a hash under destruction, so it
>> +           seems pointless attempting to track the number of keys remaining.
>> +           But if they do, we want to reset it again.  */
>> +        if (iter->xhv_fill_lazy)
>> +            iter->xhv_fill_lazy = 0;
>>      }
>>
>>      if (!((XPVHV*)SvANY(hv))->xhv_keys)
>> @@ -1731,20 +1757,35 @@ Perl_hv_undef_flags(pTHX_ HV *hv, U32 flags)
>>  Returns the number of hash buckets that happen to be in use. This function is
>>  wrapped by the macro C<HvFILL>.
>>
>> -Previously this value was stored in the HV structure, rather than being
>> -calculated on demand.
>> +Previously this value was always stored in the HV structure, which created an
>> +overhead on every hash (and pretty much every object) for something that was
>> +rarely used. Now we calculate it on demand the first time that it is needed,
>> +and cache it if that calculation is going to be costly to repeat. The cached
>> +value is updated by insertions and deletions, but (currently) discarded if
>> +the hash is split.
>>
>>  =cut
>>  */
>>
>>  STRLEN
>> -Perl_hv_fill(pTHX_ HV const *const hv)
>> +Perl_hv_fill(pTHX_ HV *const hv)
>>  {
>>      STRLEN count = 0;
>>      HE **ents = HvARRAY(hv);
>> +    struct xpvhv_aux *aux = SvOOK(hv) ? HvAUX(hv) : NULL;
>>
>>      PERL_ARGS_ASSERT_HV_FILL;
>>
>> +    /* No keys implies no buckets used.
>> +       One key can only possibly mean one bucket used.  */
>> +    if (HvTOTALKEYS(hv) < 2)
>> +        return HvTOTALKEYS(hv);
>> +
>> +#ifndef DEBUGGING
>> +    if (aux && aux->xhv_fill_lazy)
>> +        return aux->xhv_fill_lazy;
>> +#endif
>> +
>>      if (ents) {
>>         HE *const *const last = ents + HvMAX(hv);
>>         count = last + 1 - ents;
>> @@ -1754,6 +1795,16 @@ Perl_hv_fill(pTHX_ HV const *const hv)
>>                 --count;
>>         } while (++ents <= last);
>>      }
>> +    if (aux) {
>> +#ifdef DEBUGGING
>> +        if (aux->xhv_fill_lazy)
>> +            assert(aux->xhv_fill_lazy == count);
>> +#endif
>> +        aux->xhv_fill_lazy = count;
>> +    } else if (HvMAX(hv) >= HV_FILL_THRESHOLD) {
>> +        aux = hv_auxinit(hv);
>> +        aux->xhv_fill_lazy = count;
>> +    }
>>      return count;
>>  }
>>
>> @@ -1778,6 +1829,7 @@ S_hv_auxinit(HV *hv) {
>>
>>      iter->xhv_riter = -1;      /* HvRITER(hv) = -1 */
>>      iter->xhv_eiter = NULL;    /* HvEITER(hv) = NULL */
>> +    iter->xhv_fill_lazy = 0;
>>      iter->xhv_name_u.xhvnameu_name = 0;
>>      iter->xhv_name_count = 0;
>>      iter->xhv_backreferences = 0;
>> diff --git a/hv.h b/hv.h
>> index 3ee2399..beb8a1b 100644
>> --- a/hv.h
>> +++ b/hv.h
>> @@ -82,6 +82,7 @@ struct xpvhv_aux {
>>      AV         *xhv_backreferences; /* back references for weak references */
>>      HE         *xhv_eiter;     /* current entry of iterator */
>>      I32                xhv_riter;      /* current root of iterator */
>> +    U32         xhv_fill_lazy;
>>
>>  /* Concerning xhv_name_count: When non-zero, xhv_name_u contains a pointer
>>   * to an array of HEK pointers, this being the length. The first element is
>> @@ -784,7 +785,7 @@ C<SV*>.
>>  #  define Nullhv Null(HV*)
>>  #endif
>>  #define HvARRAY(hv)    ((hv)->sv_u.svu_hash)
>> -#define HvFILL(hv)     Perl_hv_fill(aTHX_ (const HV *)(hv))
>> +#define HvFILL(hv)     Perl_hv_fill(aTHX_ MUTABLE_HV(hv))
>>  #define HvMAX(hv)      ((XPVHV*)  SvANY(hv))->xhv_max
>>  /* This quite intentionally does no flag checking first. That's your
>>     responsibility.  */
>> diff --git a/proto.h b/proto.h
>> index 35d49db..3f8d44f 100644
>> --- a/proto.h
>> +++ b/proto.h
>> @@ -1548,7 +1548,7 @@ PERL_CALLCONV void        Perl_hv_ename_delete(pTHX_ HV *hv, const char *name, U32 len,
>>  #define PERL_ARGS_ASSERT_HV_FETCH_ENT  \
>>         assert(keysv)
>>
>> -PERL_CALLCONV STRLEN   Perl_hv_fill(pTHX_ HV const *const hv)
>> +PERL_CALLCONV STRLEN   Perl_hv_fill(pTHX_ HV *const hv)
>>                         __attribute__nonnull__(pTHX_1);
>>  #define PERL_ARGS_ASSERT_HV_FILL       \
>>         assert(hv)
>> diff --git a/sv.c b/sv.c
>> index 3f68d9c..3d7ed8b 100644
>> --- a/sv.c
>> +++ b/sv.c
>> @@ -12349,6 +12349,7 @@ S_sv_dup_common(pTHX_ const SV *const sstr, CLONE_PARAMS *const param)
>>                         }
>>                         daux->xhv_name_count = saux->xhv_name_count;
>>
>> +                       daux->xhv_fill_lazy = saux->xhv_fill_lazy;
>>                         daux->xhv_riter = saux->xhv_riter;
>>                         daux->xhv_eiter = saux->xhv_eiter
>>                             ? he_dup(saux->xhv_eiter,
>> diff --git a/t/op/hash.t b/t/op/hash.t
>> index 597301a..e0e8021 100644
>> --- a/t/op/hash.t
>> +++ b/t/op/hash.t
>> @@ -8,8 +8,6 @@ BEGIN {
>>
>>  use strict;
>>
>> -plan tests => 10;
>> -
>>  # This will crash perl if it fails
>>
>>  use constant PVBM => 'foo';
>> @@ -117,3 +115,91 @@ pass 'no crash when freeing hash that is being undeffed';
>>  $::ra = {a=>bless [], 'A'};
>>  %$::ra = ('a'..'z');
>>  pass 'no crash when freeing hash that is being exonerated, ahem, cleared';
>> +
>> +# If I have these correct then removing any part of the lazy hash fill handling
>> +# code in hv.c will cause some of these tests to start failing.
>> +sub validate_hash {
>> +  my ($desc, $h) = @_;
>> +  local $::Level = $::Level + 1;
>> +
>> +  my $scalar = %$h;
>> +  my $expect = qr!\A(\d+)/(\d+)\z!;
>> +  like($scalar, $expect, "$desc in scalar context matches pattern");
>> +  my ($used, $total) = $scalar =~ $expect;
>> +  cmp_ok($total, '>', 0, "$desc has >0 array size ($total)");
>> +  cmp_ok($used, '>', 0, "$desc uses >0 heads ($used)");
>> +  cmp_ok($used, '<=', $total,
>> +         "$desc doesn't use more heads than are available");
>> +  return ($used, $total);
>> +}
>> +
>> +sub torture_hash {
>> +  my $desc = shift;
>> +  # Intentionally use an anon hash rather than a lexical, as lexicals default
>> +  # to getting reused on subsequent calls
>> +  my $h = {};
>> +  ++$h->{$_} foreach @_;
>> +
>> +  my ($used0, $total0) = validate_hash($desc, $h);
>> +  # Remove half the keys each time round, until there are only 1 or 2 left
>> +  my @groups;
>> +  my ($h2, $h3, $h4);
>> +  while (keys %$h > 2) {
>> +    my $take = (keys %$h) / 2 - 1;
>> +    my @keys = (keys %$h)[0 .. $take];
>> +    my $scalar = %$h;
>> +    delete @$h{@keys};
>> +    push @groups, $scalar, \@keys;
>> +
>> +    my $count = keys %$h;
>> +    my ($used, $total) = validate_hash("$desc (-$count)", $h);
>> +    is($total, $total0, "$desc ($count) has same array size");
>> +    cmp_ok($used, '<=', $used0, "$desc ($count) has same or fewer heads");
>> +    ++$h2->{$_} foreach @keys;
>> +    my (undef, $total2) = validate_hash("$desc (+$count)", $h2);
>> +    cmp_ok($total2, '<=', $total0, "$desc ($count) array size no larger");
>> +
>> +    # Each time this will get emptied then repopulated. If the fill isn't reset
>> +    # when the hash is emptied, the used count will likely exceed the array
>> +    %$h3 = %$h2;
>> +    my (undef, $total3) = validate_hash("$desc (+$count copy)", $h3);
>> +    is($total3, $total2, "$desc (+$count copy) has same array size");
>> +
>> +    # This might use fewer buckets than the original
>> +    %$h4 = %$h;
>> +    my (undef, $total4) = validate_hash("$desc ($count copy)", $h4);
>> +    cmp_ok($total4, '<=', $total0, "$desc ($count copy) array size no larger");
>> +  }
>> +
>> +  my $scalar = %$h;
>> +  my @keys = keys %$h;
>> +  delete @$h{@keys};
>> +  is(scalar %$h, 0, "scalar keys for empty $desc");
>> +
>> +  # Rebuild the original hash, and build a copy
>> +  # These will fail if hash key addition and deletion aren't handled correctly
>> +  my $h1;
>> +  foreach (@keys) {
>> +    ++$h->{$_};
>> +    ++$h1->{$_};
>> +  }
>> +  is(scalar %$h, $scalar, "scalar keys restored when rebuilding");
>> +
>> +  while (@groups) {
>> +    my $keys = pop @groups;
>> +    ++$h->{$_} foreach @$keys;
>> +    my (undef, $total) = validate_hash("$desc " . keys %$h, $h);
>> +    is($total, $total0, "bucket count is constant when rebuilding");
>> +    is(scalar %$h, pop @groups, "scalar keys is identical when rebuilding");
>> +    ++$h1->{$_} foreach @$keys;
>> +    validate_hash("$desc copy " . keys %$h1, $h1);
>> +  }
>> +  # This will fail if the fill count isn't handled correctly on hash split
>> +  is(scalar %$h1, scalar %$h, "scalar keys is identical on copy and original");
>> +}
>> +
>> +torture_hash('a .. zz', 'a' .. 'zz');
>> +torture_hash('0 .. 9', 0 .. 9);
>> +torture_hash("'Perl'", 'Rules');
>> +
>> +done_testing();
>>
>> --
>> Perl5 Master Repository
>
>
>
> --
> perl -Mre=debug -e "/just|another|perl|hacker/"



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About