develooper Front page | perl.perl5.porters | Postings from June 2017

Re: RFC: Add new string comparison macros in handy.h

Karl Williamson
June 2, 2017 16:16
On 06/02/2017 06:46 AM, demerphq wrote:
> On 2 June 2017 at 13:30, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, May 11, 2017 at 5:22 PM, Karl Williamson wrote:
>>>      memSTARTS_WITHs
>>>              Test if the string buffer "s1" with length "l1" begins with the
>>>              substring given by the string literal "s2", returning non-zero
>>>              if so (including if the two are identical); zero otherwise. The
>>>              comparison does not include the final "NUL" of "s2". "s1" does
>>>              not have to be "NUL"-terminated,
>>>                      bool    memSTARTS_WITHs(char* s1, STRLEN l1, char* s2)
>> I don't have to use these and don't really care, but just a question:
>> Is there a reason why the prototype for the mem* functions
>> doesn't also pass the STRLEN for the needle as well as the haystack?
> The whole point of the 's' family macros is to handle cases where one
> of the arguments is a constant string in the C code, and therefore the
> length can be computed by the macro. In other words cases like this:
> STRLEN len;
> char *pv= SvPV(thing,len);
> if (memSTARTS_WITHs(pv,len,"someprefix")) { ... }
> That is why I mentioned the variants I did, which I will relist with
> better arguments:
> strIS_EQ(pv,pv)
> strIS_EQs(pv,"string")
> strIS_EQls(pv,len,"string")
> strIS_EQl(pv,len,pv)
> strIS_EQll(pv,len,pv,len)
>> Right now the interface only allows the haystack not the needle to
>> contain \0, which seems like a needless arbitrary limitation for
>> something that's essentially a fancy strstr() & memmem(). I.e. you
>> have feature-parity with strstr() (and extra features like "begins
>> with?"), but not with memmem().
> With the 's' macros we know the length of the string by using
> sizeof(). The 's' macros are composed of the STR_WITH_LEN() macro
> trick:
> #define STR_WITH_LEN(s) "" s "", sizeof(s)-1
> the "" s "" thing guarantees the argument is a C string, not a
> pointer, and the sizeof(s)-1 tells us its length.
> With the api I proposed in a reply to Karl the 'll' variants would
> cover the cases you are thinking of.
> To recap and refine that proposal:
> (mem|str)IS_(PREFIX_|SUFFIX_)?(EQ|NE|LT|GT|GE|LE)[ls]*
> More specifically the suffixes would be:
> '' :  none, both arguments are pv's without a length.
> 's': second argument is a constant string
> 'l' : first argument has a length, second argument is a pointer
> 'ls': first argument has a length, second argument is a constant string
> 'll':  both arguments are char *'s and have lengths.
> Not all suffixes would apply to 'mem', but I think they all apply to 'str'.
> Whether we should have 'str' at all is a different question.
> cheers,
> Yves

I haven't had a chance to fully evaluate this, but a couple of quick comments:

Yes, we do need 'str'.  There are a bunch of places where the length is
not known and one of the arguments is a C string (so, for most
purposes, whether the other argument is also a C string turns out not
to affect the result).
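
For illustration only, a minimal sketch of what a 'str' variant against a literal could look like when no length is available. The MY_ names are hypothetical, not the handy.h macros or the ones proposed in this thread; the sketch assumes the first argument is NUL-terminated.

```c
#include <string.h>

/* Hypothetical names, for illustration only. "" lit "" only compiles
 * when lit is a string literal, so passing a pointer is a compile-time
 * error rather than a silent sizeof(pointer) bug. */

/* Whole-string equality against a literal. */
#define MY_STR_EQs(s, lit) \
    (strcmp((s), "" lit "") == 0)

/* Prefix test: compare only the first sizeof(lit)-1 bytes. strncmp()
 * stops at a NUL in s, so a short s simply fails to match. */
#define MY_STR_STARTS_WITHs(s, lit) \
    (strncmp((s), "" lit "", sizeof(lit) - 1) == 0)
```

So MY_STR_STARTS_WITHs(pv, "someprefix") needs neither a length for pv nor a hand-counted 10.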

I went through the core, and the existing macros plus the ones I 
proposed are sufficient to handle the existing cases and make the code 
much easier to grok without detailed examination.

They are also easier to program right, as the coder doesn't have to 
count the length manually.
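
A rough sketch of why nobody has to count, assuming the STR_WITH_LEN definition quoted above. The helper function and the MY_ macro are hypothetical names made up for this example, not the actual handy.h implementation.

```c
#include <stddef.h>
#include <string.h>

/* As quoted earlier in the thread: expands to "literal", length. */
#define STR_WITH_LEN(s) "" s "", sizeof(s) - 1

/* Hypothetical helper: does the buffer s1 (l1 bytes, not necessarily
 * NUL-terminated) begin with the l2 bytes at s2? */
static int my_mem_starts_with(const char *s1, size_t l1,
                              const char *s2, size_t l2)
{
    return l1 >= l2 && memcmp(s1, s2, l2) == 0;
}

/* Hypothetical 's' variant: the compiler supplies the literal's length. */
#define MY_MEM_STARTS_WITHs(s1, l1, lit) \
    my_mem_starts_with((s1), (l1), STR_WITH_LEN(lit))
```

Compare that with writing memcmp(pv, "someprefix", 10) by hand, where the 10 has to be kept in sync with the literal.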

The ENDS_WITH variant is used in a few places, like seeing if a path ends in '.pm'.
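
Roughly, the '.pm' check could be sketched as below. Both names are hypothetical stand-ins for this example, not the real macro; the point is only that the suffix test compares the last sizeof(lit)-1 bytes of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the buffer s1 (l1 bytes) end with the
 * l2 bytes at s2? The length check comes first so that the pointer
 * arithmetic never reaches before the start of s1. */
static int my_mem_ends_with(const char *s1, size_t l1,
                            const char *s2, size_t l2)
{
    return l1 >= l2 && memcmp(s1 + (l1 - l2), s2, l2) == 0;
}

/* Hypothetical 's' variant: lit must be a string literal. */
#define MY_MEM_ENDS_WITHs(s1, l1, lit) \
    my_mem_ends_with((s1), (l1), "" lit "", sizeof(lit) - 1)
```

With that, the path test reads as MY_MEM_ENDS_WITHs(path, len, ".pm").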

I'm not a fan of the trailing 's' in the name meaning a literal string.
Both arguments are always strings.  I had thought 'l' for 'literal', but
you have used that one up.  Maybe 'q' for 'quoted'.

I also have never liked a prefix IS (or 'is').  It's just extra typing 
that doesn't really help readability.
