develooper Front page | perl.perl5.porters | Postings from June 2017

Re: RFC: Add new string comparison macros in handy.h

From: demerphq
Date: June 1, 2017 20:53
Subject: Re: RFC: Add new string comparison macros in handy.h
Message ID: CANgJU+XaShTLPLFEuZ_ShcA9eL8uOrhZ3J644VHyKrHK7JY7bA@mail.gmail.com

On 11 May 2017 at 17:22, Karl Williamson <public@khwilliamson.com> wrote:
> I would like to add the macros given below to handy.h.  The situations they
> handle occur reasonably frequently in the core, and these can save
> developers from thinking they have to manually count the characters in a
> string.
>
> I am not confident at all about the names, and would like to see if people
> have better ones.

I think creating a new set of macros with clearer names is a good
idea, but how easy is it for us to deprecate the old ones?

I wanted to give a summary of the history at stake here:

We have had the following macros since the beginning of perl's history:

^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  478)
#define strNE(s1,s2) (strcmp(s1,s
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  479)
#define strEQ(s1,s2) (!strcmp(s1,
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  480)
#define strLT(s1,s2) (strcmp(s1,s
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  481)
#define strLE(s1,s2) (strcmp(s1,s
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  482)
#define strGT(s1,s2) (strcmp(s1,s
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  483)
#define strGE(s1,s2) (strcmp(s1,s
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  485)
#define strnNE(s1,s2,l) (strncmp(
^8d063cd (Larry Wall               1987-12-18 00:00:00 +0000  486)
#define strnEQ(s1,s2,l) (!strncmp

We have had these since 1996:

36477c24 (Perl 5 Porters           1996-12-06 18:56:00 +1200  497)
# define memNE(s1,s2,l) (memcmp(
36477c24 (Perl 5 Porters           1996-12-06 18:56:00 +1200  498)
# define memEQ(s1,s2,l)

We have had these since 2007:

568a785a (Nicholas Clark           2007-03-23 16:55:13 +0000  505)
#define memEQs(s1, l, s2) \
777fa2cb (Yves Orton               2016-10-19 10:32:29 +0200  506)
    (((sizeof(s2)-1) == (l))
568a785a (Nicholas Clark           2007-03-23 16:55:13 +0000  507)
#define memNEs(s1, l, s2) !memEQs

You added these in September 2016:

062b6850 (Karl Williamson          2016-09-10 08:54:36 -0600  515)
#define memLT(s1,s2,l) (memcmp(s1
062b6850 (Karl Williamson          2016-09-10 08:54:36 -0600  516)
#define memLE(s1,s2,l) (memcmp(s1
062b6850 (Karl Williamson          2016-09-10 08:54:36 -0600  517)
#define memGT(s1,s2,l) (memcmp(s1
062b6850 (Karl Williamson          2016-09-10 08:54:36 -0600  518)
#define memGE(s1,s2,l) (memcmp(s1

I added these in October 2016. (In a post I just sent, I realized they
were misnamed: strNEs() should have been called strnNEs(), with the 'n'
that is currently missing, to comply with strnNE().)

62946e08 (Yves Orton               2016-10-19 10:30:44 +0200  492)
#define strNEs(s1,s2) (strncmp(s1
62946e08 (Yves Orton               2016-10-19 10:30:44 +0200  493)
#define strEQs(s1,s2) (!strncmp(s

and these:

777fa2cb (Yves Orton               2016-10-19 10:32:29 +0200  511)
#define _memEQs(s1, s2) \
777fa2cb (Yves Orton               2016-10-19 10:32:29 +0200  512)
    (memEQ((s1), ("" s2 ""),
777fa2cb (Yves Orton               2016-10-19 10:32:29 +0200  513)
#define _memNEs(s1, s2) (memNE((s


> I also would like to document memEQs, memLE, memLT, memGE, and memGT. And
> move all similar macros to a new section, "String comparison functions",
> from the current "Miscellaneous".
>
>     strSTARTS_WITHs
>             Test if the "NUL"-terminated string "s1" begins with the substring
>             given by the string literal "s2", returning non-zero if so
>             (including if the two are identical); zero otherwise.
>
>                     bool    strSTARTS_WITHs(char* s1, char* s2)

So this is equivalent to the current strEQs().

To comply with existing convention strEQs() should be renamed strnEQs().

I think adding a long-form equivalent is OK, but I think the old
naming convention (assuming the name is corrected to include the 'n')
makes sense too.
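
To make the equivalence concrete, here is a minimal sketch of a
literal-prefix test built on strncmp(). The blame output above truncates
the real strEQs() definition, so this is not a copy of it: the name
STARTS_WITH_LIT and the self-check function are hypothetical, and only the
"" lit "" idiom is taken from the _memEQs() lines quoted earlier.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name, not a handy.h macro: true if the NUL-terminated
 * string s begins with the string literal lit. The "" lit "" idiom makes
 * the compiler reject anything that is not a string literal, and
 * sizeof(lit) - 1 is the literal's length without its trailing NUL. */
#define STARTS_WITH_LIT(s, lit) \
    (strncmp((s), ("" lit ""), sizeof(lit) - 1) == 0)

/* Small self-check exercising the macro. */
void starts_with_lit_check(void)
{
    assert(STARTS_WITH_LIT("foobar", "foo"));   /* proper prefix */
    assert(STARTS_WITH_LIT("foo", "foo"));      /* identical strings */
    assert(!STARTS_WITH_LIT("fo", "foo"));      /* s shorter than literal */
    assert(!STARTS_WITH_LIT("barfoo", "foo"));  /* literal not at the start */
}
```

Because strncmp() stops at the NUL in s, the "shorter than the literal"
case is rejected without needing a separate length argument.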

>     memSTARTS_WITHs
>             Test if the string buffer "s1" with length "l1" begins with the
>             substring given by the string literal "s2", returning non-zero if
>             so (including if the two are identical); zero otherwise. The
>             comparison does not include the final "NUL" of "s2". "s1" does not
>             have to be "NUL"-terminated,

So the difference with the 'str' version is that str() considers a
null byte to be end of string, and mem() does not. Is there any case
where using memcmp() instead of str[n]cmp() is wrong for this type of
macro? If not maybe we should just have one (using memcmp).
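
As an illustration of the NUL handling described above (a sketch, not a
statement about any handy.h macro), the following compares two buffers
that agree up to an embedded NUL and differ after it. The function name is
hypothetical. One case that may matter for the question: memcmp() is
entitled to read all n bytes of both operands, so handing it a
NUL-terminated s1 that is shorter than the literal could read past the end
of s1, whereas strncmp() stops at the NUL.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical self-check showing where strncmp() and memcmp() disagree. */
void nul_handling_check(void)
{
    /* Two 5-byte buffers that agree up to and including an embedded NUL
     * and differ afterwards. */
    const char a[] = { 'a', 'b', '\0', 'c', 'd' };
    const char b[] = { 'a', 'b', '\0', 'x', 'y' };

    /* strncmp() stops at the first NUL, so it reports equality. */
    assert(strncmp(a, b, sizeof a) == 0);

    /* memcmp() compares all 5 bytes, so it sees the difference. */
    assert(memcmp(a, b, sizeof a) != 0);
}
```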


>                     bool    memSTARTS_WITHs(char* s1, STRLEN l1, char* s2)
>
>     memENDS_WITHs
>             Test if the string buffer "s1" with length "l1" ends with the
>             substring given by the string literal "s2", returning non-zero if
>             so (including if the two are identical); zero otherwise. The
>             comparison does not include the final "NUL" of "s2". "s1" does not
>             have to be "NUL"-terminated,
>
>                     bool    memENDS_WITHs(char* s1, STRLEN l1, char* s2)

Do we actually have/use this? Beyond the comments above about "mem" vs
"str" I don't have any problem with this.
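
For reference, a suffix test of the kind quoted above could look roughly
like this. It is a sketch under assumptions: ENDS_WITH_LIT is a
hypothetical name, not the proposed memENDS_WITHs() implementation, and it
assumes len is an unsigned length such as size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name: true if the buffer s of length len ends with the
 * string literal lit. The length check comes first, so memcmp() never
 * starts before the beginning of s. Note that s and len are evaluated
 * more than once. */
#define ENDS_WITH_LIT(s, len, lit) \
    ((len) >= sizeof(lit) - 1 \
     && memcmp((s) + (len) - (sizeof(lit) - 1), \
               ("" lit ""), sizeof(lit) - 1) == 0)

/* Small self-check exercising the macro. */
void ends_with_lit_check(void)
{
    assert(ENDS_WITH_LIT("foo.pm", (size_t)6, ".pm"));   /* proper suffix */
    assert(ENDS_WITH_LIT(".pm", (size_t)3, ".pm"));      /* identical */
    assert(!ENDS_WITH_LIT("pm", (size_t)2, ".pm"));      /* buffer too short */
    assert(!ENDS_WITH_LIT("foo.pl", (size_t)6, ".pm"));  /* different suffix */
}
```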

>
>     memFOO_STARTING_WITHs
>             Test if the string buffer "s1" with length "l1" begins with the
>             substring given by the string literal "s2", and that "s1" is
>             longer than "s2", returning non-zero if so; zero otherwise. In
>             other words, "s2" begins "s1" but is not all of "s1". The
>             comparison does not include the final "NUL" of "s2". "s1" does not
>             have to be "NUL"-terminated,
>
>                     bool    memFOO_STARTING_WITHs(char* s1, STRLEN l1,
>                                                   char* s2)
>
>     memFOO_ENDING_WITHs
>             Test if the string buffer "s1" with length "l1" ends with the
>             substring given by the string literal "s2", and that "s1" is
>             longer than "s2", returning non-zero if so; zero otherwise. In
>             other words, "s2" ends "s1" but is not all of "s1". The comparison
>             does not include the final "NUL" of "s2". "s1" does not have to be
>             "NUL"-terminated,
>
>                     bool    memFOO_ENDING_WITHs(char* s1, STRLEN l1,
>                                                 char* s2)

So we need something better than FOO.
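
Whatever the name ends up being, the only difference from the plain
starts-with/ends-with forms is a strict length comparison. A minimal
sketch, with hypothetical placeholder names that are not being proposed
here, and assuming len is an unsigned length such as size_t:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names. Both macros require the buffer to be strictly
 * longer than the literal, so an exact match is rejected. */
#define HAS_PROPER_PREFIX_LIT(s, len, lit) \
    ((len) > sizeof(lit) - 1 \
     && memcmp((s), ("" lit ""), sizeof(lit) - 1) == 0)

#define HAS_PROPER_SUFFIX_LIT(s, len, lit) \
    ((len) > sizeof(lit) - 1 \
     && memcmp((s) + (len) - (sizeof(lit) - 1), \
               ("" lit ""), sizeof(lit) - 1) == 0)

/* Small self-check exercising both macros. */
void proper_affix_check(void)
{
    assert(HAS_PROPER_PREFIX_LIT("foobar", (size_t)6, "foo"));
    assert(!HAS_PROPER_PREFIX_LIT("foo", (size_t)3, "foo"));  /* equal: no */
    assert(!HAS_PROPER_PREFIX_LIT("fo", (size_t)2, "foo"));   /* too short */

    assert(HAS_PROPER_SUFFIX_LIT("foobar", (size_t)6, "bar"));
    assert(!HAS_PROPER_SUFFIX_LIT("bar", (size_t)3, "bar"));  /* equal: no */
}
```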

Personally I would prefer to see a convention more like:

(mem|str)IS_(PREFIX|SUFFIX|EQ|NE|LT|GT|GE|LE)[ls]*

With the appropriate mix of arguments specified by the suffix.

That would mean all of the macros of the form strIS() and memIS() come
from the new convention, and everything else is historical.

So I could imagine a macro

as well as

strIS_EQ(s1,s2)
strIS_EQs(s1,s2)
strIS_EQls(s1,l1,s2)
strIS_EQl(s1,l1,s2)
strIS_EQll(s1,l1,s2,l2)

and possibly a few other permutations.
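
A rough sketch of how a few names matching that pattern might be defined.
Everything here is an assumption for illustration: it reads 'l' as "this
operand comes with an explicit length" and 's' as "this operand is a
string literal", and it uses the mem flavour because explicit lengths map
directly onto memcmp(). None of these macros exist in handy.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ASSUMPTION: 'l' = operand carries an explicit length,
 *             's' = operand is a string literal sized with sizeof. */

/* Both operands carry explicit lengths. */
#define memIS_EQll(s1, l1, s2, l2) \
    ((l1) == (l2) && memcmp((s1), (s2), (l1)) == 0)

/* First operand has a length, second is a string literal. */
#define memIS_EQls(s1, l1, s2) \
    memIS_EQll((s1), (l1), ("" s2 ""), sizeof(s2) - 1)

/* True if the literal s2 is a prefix of the buffer s1 (equality allowed). */
#define memIS_PREFIXls(s1, l1, s2) \
    ((l1) >= sizeof(s2) - 1 \
     && memcmp((s1), ("" s2 ""), sizeof(s2) - 1) == 0)

/* Small self-check exercising the three variants. */
void is_convention_check(void)
{
    assert(memIS_EQll("abc", (size_t)3, "abc", (size_t)3));
    assert(!memIS_EQll("abc", (size_t)3, "abcd", (size_t)4));

    assert(memIS_EQls("foo", (size_t)3, "foo"));
    assert(!memIS_EQls("foobar", (size_t)6, "foo"));

    assert(memIS_PREFIXls("foobar", (size_t)6, "foo"));
    assert(!memIS_PREFIXls("fo", (size_t)2, "foo"));
}
```

The point of the suffix scheme is that the argument list can be read off
the name, so a new variant has only one plausible spelling.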

I like the idea of standardizing this stuff with conventions that are
well described and predictable, so that if we have to add a new variant
it is well defined what it should be called.

cheers,
Yves
-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
