develooper Front page | perl.perl5.porters | Postings from June 2013

Re: [perl #113824] Regexp error messages are not UTF8-clean

Leon Timmermans
June 18, 2013 10:07
On Tue, Jun 18, 2013 at 11:53 AM, demerphq wrote:
> This is a Perl API fail. I do not see how it can be fixed without
> grievous trauma. Apparently much of our internal error message
> handling code is not UTF8 safe.
> See the code for vFAIL() in regcomp.c which calls Perl_croak() which
> calls vcroak().
> The interfaces for Perl_croak() and friends do not support UTF8 at all.
> They accept only a char* pointer, and have no facility for a UTF8
> flag.
> We could fix the direct problem by rewriting all the code in the regex
> engine which uses UTF8, but imo that is just a bandage. The real
> problem is that our core APIs were never modernized to work properly with
> Unicode.
> IMO, this ticket should be closed as a "won't fix", or merged with a
> ticket which relates to our internal error reporting APIs lacking
> proper Unicode support and fixed as part of resolving THAT ticket.
> Also IMO, if we want to really fix this stuff we should just bite the
> bullet, deprecate ALL of the char * only internal APIs and switch to
> something that ALWAYS includes a utf8 flag. Across the board.

You can croak an SV actually. It's ugly (setting $@ to the error and
then croaking NULL), but possible.
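
To make the difference concrete, here is a small self-contained C model of the two kinds of interface. It is only a sketch and does not use the Perl headers: model_sv, emit_utf8, croak_like_pv and croak_like_sv are hypothetical names invented for illustration, and the "treat unflagged bytes as Latin-1 and upgrade them on output" rule is a simplified assumption about what happens when the message reaches a UTF-8 handle. In real XS code the idiom would, as far as I understand perlapi, be roughly sv_setsv(ERRSV, sv) followed by croak(NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an SV: the bytes plus a flag saying how to
 * interpret them. This is NOT the real Perl SV structure. */
typedef struct {
    const char *bytes;
    size_t      len;
    int         is_utf8;  /* 1: bytes are UTF-8; 0: one byte per character */
} model_sv;

/* Write the message as it would appear on a UTF-8 output handle.
 * Assumption of this model: unflagged high bytes are treated as Latin-1
 * characters and encoded as two UTF-8 bytes; flagged bytes are copied.
 * Returns the number of bytes written (excluding the trailing NUL). */
static size_t emit_utf8(const model_sv *sv, char *out, size_t outsz)
{
    size_t n = 0;
    assert(outsz > 0);
    for (size_t i = 0; i < sv->len; i++) {
        unsigned char c = (unsigned char)sv->bytes[i];
        if (sv->is_utf8 || c < 0x80) {
            if (n + 1 >= outsz) break;      /* keep room for the NUL */
            out[n++] = (char)c;
        } else {
            if (n + 2 >= outsz) break;
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

/* char*-only interface: there is nowhere to pass the flag, so the
 * message is necessarily treated as unflagged bytes. */
static size_t croak_like_pv(const char *msg, char *out, size_t outsz)
{
    model_sv sv = { msg, strlen(msg), 0 };
    return emit_utf8(&sv, out, outsz);
}

/* SV-style interface: the flag travels with the string. */
static size_t croak_like_sv(const model_sv *sv, char *out, size_t outsz)
{
    return emit_utf8(sv, out, outsz);
}
```

Passing the UTF-8 bytes of a pattern fragment such as "\xE3\x83\x8D[" through croak_like_pv double-encodes the three high bytes, while croak_like_sv with is_utf8 set copies them through unchanged. That is the reason going through an SV sidesteps the missing flag in this model.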

