perl.perl5.porters | Postings from October 2012

charclass_invlists.h declarations intentionally global?

Craig A. Berry
October 25, 2012 16:24
charclass_invlists.h has a bunch of declarations that look like, for example:

UV AboveLatin1_invlist[] = {
        1,      /* Number of elements */
        0,      /* Current iteration position */
        0,      /* Cache of previous search index result */
        290655244, /* Version and data structure type */
        1,      /* 0 if this is the first element of the list proper;
                   1 if the next element is the first */

These declarations have file scope in regcomp.c (which doubles as ext/re/re_comp.c) because that's the level at which charclass_invlists.h is included.  

C99, section 6.2.2 item 5 says, "If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern.  If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external."

So these symbols get external linkage by default, and on platforms that export everything, they end up externally visible in libperl:

% nm -go 'libperl.a(regcomp.o)' | grep AboveLatin1_invlist
libperl.a:regcomp.o: 00000000000229e0 D _AboveLatin1_invlist

as well as in the dynamic library for the re extension:

% nm lib/auto/re/re.bundle | grep AboveLatin1_invlist
0000000000057ca0 D _AboveLatin1_invlist

And if I'm reading the nm man page correctly, the upper case "D" indicates that these are in the data section and external. That means we've got two global symbols of the same name but at different addresses, one in the perl binary proper and one in the extension, which a linker could legitimately complain about (and the one on OpenVMS VAX does). Plus, global data that we don't mean to be global is generally frowned upon.

So unless there is some plan in mid-stream to use these symbols outside of regcomp.c/re_comp.c, I think the right thing to do is to simply make all those declarations static, because, again quoting C99, "If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage." The attached patch does that. Any objections?
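
To make the difference concrete, here is a minimal sketch of the two kinds of declaration side by side. It is not the patch itself: the UV typedef is only a stand-in for Perl's own type, and the array names, their contents, and the accessor function are all made up for illustration.

```c
#include <stddef.h>

/* Stand-in for Perl's UV type; the real definition lives in perl.h
   (assumption for this sketch). */
typedef unsigned long UV;

/* File scope, no storage-class specifier: external linkage, so the
   symbol is visible to the linker from every object file.
   Hypothetical name and contents. */
UV Example_global_invlist[] = { 1, 0, 0 };

/* File scope with static: internal linkage, so the symbol is private
   to this translation unit. Hypothetical name and contents. */
static UV Example_static_invlist[] = { 1, 0, 0 };

/* Code in the same file can still use the static array normally. */
size_t example_static_len(void)
{
    return sizeof Example_static_invlist / sizeof Example_static_invlist[0];
}
```

Compiled into an object file, the first array should show up in nm output with an upper case "D" and the second with a lower case "d" (or not at all if the symbol table is stripped), so two object files can each carry their own copy of the static one without a name clash.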

Craig A. Berry

"... getting out of a sonnet is much more
 difficult than getting in."
                 Brad Leithauser
