On 11/20/2016 11:03 AM, Karl Williamson wrote:
> On 11/20/2016 08:20 AM, Aaron Crane wrote:

I think the patch should be committed.

>> Sawyer X <xsawyerx@gmail.com> wrote:
>>> On 10/30/2016 07:10 PM, Aristotle Pagaltzis wrote:
>>>> I would prefer to see this just fixed, for everyone, with cleaner code.
>>>> And it’s very *likely* that that can be done… just not *known*. A cycle
>>>> or two with warnings would give us data to calibrate the guess.
>>>
>>> Again, I'm not necessarily against that. I'm trying to add more
>>> considerations here. Perhaps the feature is the right place for it,
>>> using "unicode_strings".
>>
>> On the assumption that a concrete change is easier to reason about
>> than the abstract situation, I attach a proposed patch for the Unicode
>> Bug in the range operator.
>>
>> The patch itself is fairly straightforward; its guts look like this:
>>
>> --- a/pp_ctl.c
>> +++ b/pp_ctl.c
>> @@ -1222,6 +1222,8 @@ PP(pp_flop)
>>              const char * const tmps = SvPV_nomg_const(right, len);
>>
>>              SV *sv = newSVpvn_flags(lpv, llen, SvUTF8(left)|SVs_TEMP);
>> +            if (DO_UTF8(right) && IN_UNI_8_BIT)
>> +                len = sv_len_utf8_nomg(right);
>>              while (!SvNIOKp(sv) && SvCUR(sv) <= len) {
>>                  XPUSHs(sv);
>>                  if (strEQ(SvPVX_const(sv),tmps))
>>
>> (Except twice, because "foreach ($x .. $y)" has an independent
>> implementation that takes constant memory.)
>>
>> That is, this change makes stringy $x..$y honour the unicode_strings
>> feature, without any warning.
>>
>> FWIW, my own view is that this change is simply a bugfix for ranges
>> under the unicode_strings feature, and that the current behaviour is
>> so bizarre and unpredictable that no warning is necessary. (Or even
>> entirely useful, since we can't distinguish between code that wants
>> the current behaviour (but neglected to utf8::decode the RHS) and code
>> that's been updated to take advantage of the new behaviour.)
>>
>
> As I believe it has been pointed out before, the use of that feature
> implies that the user wants proper handling of unicode strings. That is
> why in earlier releases, it was enhanced to include more things, like
> quotemeta as they were unearthed, instead of creating extra features.
> Thus, treating this as a bug fix follows the existing paradigm that we
> followed.
>
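
For anyone skimming the diff: the two added lines swap the byte length of the right operand for its character count when that operand is UTF-8 encoded and unicode_strings is in effect. Below is a minimal standalone sketch of that distinction. It does not use the perl API at all; utf8_char_count is a hypothetical helper of my own, standing in roughly for what sv_len_utf8_nomg reports, and it assumes well-formed UTF-8 input.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of perl: count the characters in a
 * buffer assumed to hold well-formed UTF-8. Every character contributes
 * exactly one byte that is not a continuation byte (10xxxxxx), so
 * counting those bytes gives the character count.
 */
static size_t utf8_char_count(const char *s, size_t byte_len)
{
    size_t chars = 0;
    size_t i;

    for (i = 0; i < byte_len; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For a pure ASCII string the two lengths agree, so nothing changes. For a string such as "\xC3\xA9\xC3\xA9" (two e-acute characters encoded as four bytes) the byte length is 4 but the character count is 2, which is the kind of gap the patched length comparison accounts for.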