develooper Front page | perl.perl5.porters | Postings from January 2001

Re: more UTF8 test suites and an UTF8 patch

Inaba Hiroto
January 8, 2001 06:30
I'm sorry for the late reply.

Jarkko Hietaniemi wrote:

> > They are converted from t/op/{subst.t,substr.t,regexp.t,re_tests}
> > simply translating ascii characters to unicode characters.  (In fact,
> > they are "FULLWIDTH" characters code FF01-FF5E)
> I don't like the idea in future of having to always remember to patch
> the _utf8.t versions if someone patches the non-utf8 versions (or vice
> versa).

Yes, I agree.

> Would it be somehow possible to automate the task so that there would
> be some sort of template files from which both the byte and 'wide
> character' version would be automagically produced?  (The template
> could of course be the byte version, to save space)

Now I'm working to make such template files for
and its UTF-8 version.
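
As a rough illustration of the translation step such a generator would
need, here is a minimal sketch. It relies only on the fact that the
FULLWIDTH forms U+FF01-U+FF5E sit at a fixed offset of 0xFEE0 from ASCII
0x21-0x7E. The helper name to_fullwidth() is hypothetical and is not
part of the patch; a real template would also have to mark which
characters are test data and which are syntax that must stay ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the patch): map printable ASCII
# (0x21-0x7E) to the FULLWIDTH forms (U+FF01-U+FF5E). The two ranges
# are the same length and differ by a constant offset of 0xFEE0.
# Space (0x20) and control characters are left untouched.
sub to_fullwidth {
    my ($text) = @_;
    $text =~ s/([\x21-\x7E])/chr(ord($1) + 0xFEE0)/ge;
    return $text;
}

my $wide = to_fullwidth("abc");

# Print code points rather than the characters themselves, so no
# output encoding layer is needed.
print join(' ', map { sprintf 'U+%04X', ord($_) } split //, $wide), "\n";
# prints: U+FF41 U+FF42 U+FF43
```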

> > pp_ctl.c:
> >   In pp_regcomp(), use PMdf_DYN_UTF8 flag to set pm->op_pmdynflags
> >   instead of PMdf_UTF8 flag.
> If you have formed some sort of clear idea of the various UTF8 flags
> (what each one is doing), please feel free to document them somewhere.

I suppose the PMdf_UTF8 flag means that the regexp contains UTF8 data at
script compile time, and the PMdf_DYN_UTF8 flag (which I introduced) means
that a dynamically interpolated string is UTF8.
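
At the Perl level, the two situations look roughly like the sketch
below. It only shows the two kinds of pattern; which internal flag is
set for each is not visible from Perl code, and the variable names are
made up for the example.

```perl
use strict;
use warnings;

my $target = "x\x{FF41}y";   # contains FULLWIDTH LATIN SMALL LETTER A

# Case 1: the wide character is written in the pattern itself, so it
# is already known when the script is compiled.
my $static = ($target =~ /\x{FF41}/) ? 1 : 0;

# Case 2: the wide character reaches the pattern only at run time,
# through an interpolated variable.
my $piece   = chr(0xFF41);
my $dynamic = ($target =~ /x${piece}y/) ? 1 : 0;

print "static=$static dynamic=$dynamic\n";
```

On a current perl both patterns match. Nothing is claimed here about how
the development version under discussion behaves.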

> > toke.c:
> >   In scan_const(), change `\x{...}' parsing logic.
> While you are at it, could you change [\x{80}-\x{ff}] to produce/match
> (string constants / regexes) bytes, not UTF-8 characters?  This way
> it would be internally consistent with chr() and vstrings.

I think we can (though t/op/length.t test 7 assumes the current behavior).
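
The consistency in question can be checked from Perl with something
like the sketch below, which compares a \x{...} constant in the
0x80-0xFF range with the same code point from chr(). It is only a way
to inspect the behavior of whatever perl runs it, and says nothing
about what t/op/length.t test 7 checks. utf8::is_utf8() is assumed to
be available, as it is in current perls.

```perl
use strict;
use warnings;

my $from_escape = "\x{e9}";    # code point in 0x80-0xFF via \x{...}
my $from_chr    = chr(0xe9);   # same code point via chr()
my $wide        = "\x{100}";   # above 0xFF, cannot be stored as one byte

for my $pair ([escape => $from_escape], [chr => $from_chr], [wide => $wide]) {
    my ($name, $str) = @$pair;
    # Report the character count and whether perl stores the string
    # internally as UTF-8.
    printf "%-6s chars=%d utf8_flag=%d\n",
        $name, length($str), utf8::is_utf8($str) ? 1 : 0;
}

print "escape and chr agree\n" if $from_escape eq $from_chr;
```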

> > Lastly, the new pragma I would like to propose in the patch is,
> >
> > lib/
> >   `distinct' is a pragma to strictly distinguish UTF8 data and
> >   non-UTF data.

> Ummmm.  Introducing new pragmas should be considered carefully.


> Can you give an example of how to use this pragma?  What problem
> does it solve?

Actually, I have no real problem to solve with this pragma.
I'll send a separate mail for this topic.
    Inaba Hiroto
