develooper Front page | perl.perl5.porters | Postings from June 2010

[perl #75574] PATCH: change names of two variables in regexec.c

karl williamson
June 7, 2010 02:17
# New Ticket Created by  karl williamson 
# Please include the string:  [perl #75574]
# in the subject line of all future correspondence about this issue. 
# <URL: >

My recently accepted patch, which added a synonym for a subroutine that I 
find much more informative, has led me to try doing the same for two 
other names that I find especially confusing, and that have apparently 
confused others as well.

I always thought the Sapir-Whorf hypothesis made a lot of sense, even 
though I was told when I studied it in college that it was discredited. 
(My daughter, who has a degree in linguistics, tells me that it is 
back in favor, and a quick look at Wikipedia confirms that.)

Anyway, I have found a few bugs so far in the code that have the same 
root cause: the failure to realize that when you have two string-like 
entities, either one or both may be in UTF8, which always leads to 4 
possibilities.  Often the code fails to take one of those possibilities 
into account.

This kind of bug is in regexec.c, and I wonder how prevalent it is 
there.  In this file there is a pattern and a target to match against, 
and the 4 possibilities are always there.  But the variable meaning that 
the pattern is in UTF8 is 'UTF', and the variable meaning that the 
target string is in UTF8 is 'do_utf8'.  The bugs I've found stem from 
forgetting that the pattern can be in UTF8 without the target being so, 
and the very name 'do_utf8', which applies only to the target, seems to 
me to lead one down this incorrect path.

It was an easy patch to change UTF to UTF_PATTERN and do_utf8 to 
utf8_target.  The new names will help me, and hopefully others as well, 
to remember while scanning the code to always be cognizant of the 4 
possibilities.
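
To make the 4 possibilities concrete, here is a minimal standalone 
sketch.  It is not code from regexec.c and uses none of perl's internal 
API.  The helpers next_char() and strings_equal() are hypothetical, and 
the flag parameters utf8_pattern and utf8_target merely echo the naming 
idea above.  The sketch assumes that a non-UTF8 string holds one 
character per byte, and that a UTF8 string is well-formed (it does no 
validation).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one character starting at s and store the
 * number of bytes consumed in *len.  When is_utf8 is false, every byte is
 * one character.  When it is true, s is ASSUMED to be well-formed UTF-8. */
static uint32_t next_char(const unsigned char *s, bool is_utf8, size_t *len)
{
    if (!is_utf8 || s[0] < 0x80) {          /* single byte */
        *len = 1;
        return s[0];
    }
    if ((s[0] & 0xE0) == 0xC0) {            /* 2-byte sequence */
        *len = 2;
        return ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
    }
    if ((s[0] & 0xF0) == 0xE0) {            /* 3-byte sequence */
        *len = 3;
        return ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6)
             | (s[2] & 0x3Fu);
    }
    *len = 4;                               /* 4-byte sequence */
    return ((s[0] & 0x07u) << 18) | ((s[1] & 0x3Fu) << 12)
         | ((s[2] & 0x3Fu) << 6) | (s[3] & 0x3Fu);
}

/* Hypothetical helper: compare pattern and target character by character.
 * Each side carries its own flag, so all 4 combinations are handled by
 * the same loop instead of one flag being silently applied to both. */
static bool strings_equal(const unsigned char *pat, size_t pat_len,
                          bool utf8_pattern,
                          const unsigned char *tgt, size_t tgt_len,
                          bool utf8_target)
{
    size_t i = 0, j = 0;

    while (i < pat_len && j < tgt_len) {
        size_t pl, tl;
        uint32_t pc = next_char(pat + i, utf8_pattern, &pl);
        uint32_t tc = next_char(tgt + j, utf8_target, &tl);

        if (pc != tc)
            return false;
        i += pl;
        j += tl;
    }
    /* Equal only if both strings were consumed completely. */
    return i == pat_len && j == tgt_len;
}
```

For example, the character U+00E9 is the single byte 0xE9 in a non-UTF8 
string but the two bytes 0xC3 0xA9 in a UTF8 one.  With the correct flag 
on each side the two compare equal in all 4 combinations; treating the 
UTF8 side as non-UTF8 makes the comparison fail, which is the kind of 
mistake a name like 'do_utf8' invites.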
