Re: Pre-RFC: New C API for converting from UTF-8 to code point

Tony Cook
June 29, 2022 01:55
On Tue, Jun 28, 2022 at 09:17:50AM -0600, Karl Williamson wrote:
> In response to GH #19897 and GH #19842, I think we need to come up with a
> better API to replace the deprecated functions.
> One of the issues with the existing API is that the behavior changes
> depending on whether warnings are enabled or not; something usually outside
> the purview of a module author.  There's also the problem in some cases of
> having to disambiguate the return being successful or not.

Are these intended to replace just the deprecated functions, or also
to serve as an easier-to-use alternative to utf8n_to_uvchr() and its
variants?

It might be worth describing how the new APIs differ from the existing
non-deprecated APIs.  The obvious difference is start/end vs
start/length, but I think error reporting is handled differently too.

> The program would not have to concern itself with malformed input; the
> function would take care of that by itself, returning REPLACEMENT CHARACTER
> for each malformed sequence, and setting retlen to be the offset of the
> starting position of the next potentially legal character.  If utf8 warnings
> are on, those would be raised for each iteration that found a malformation.

Would there be a simple way to prevent this API from producing warnings?
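
For concreteness, the behaviour described in the quoted paragraph could
be sketched roughly as below. This is only an illustration in standalone
C, not the proposed perl API: the name sketch_utf8_to_cp() is
hypothetical, it uses plain C types rather than U8/UV/STRLEN, it follows
strict UTF-8 (rejecting surrogates, overlongs, and anything above
U+10FFFF, which perl's own decoders do not necessarily do by default),
and it never warns, so the warning question above is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REPLACEMENT 0xFFFDu  /* U+FFFD REPLACEMENT CHARACTER */

/* Hypothetical helper: true for bytes of the form 10xxxxxx. */
static int sketch_is_continuation(uint8_t b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical start/end style decoder, NOT part of perl's API.
 * Decodes one code point from [s, end).  On malformed input it returns
 * U+FFFD and sets *retlen so that s + *retlen is the next position at
 * which a legal character could start.  Requires s < end. */
uint32_t sketch_utf8_to_cp(const uint8_t *s, const uint8_t *end,
                           size_t *retlen)
{
    assert(s < end);

    size_t avail = (size_t)(end - s);
    uint8_t b0 = s[0];
    size_t need;        /* number of continuation bytes expected */
    uint32_t cp, min;   /* accumulated value, smallest non-overlong value */

    if (b0 < 0x80) {
        *retlen = 1;
        return b0;
    } else if ((b0 & 0xE0) == 0xC0) {
        need = 1; cp = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
        need = 2; cp = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
        need = 3; cp = b0 & 0x07; min = 0x10000;
    } else {
        /* Stray continuation byte or invalid lead byte: skip one byte. */
        *retlen = 1;
        return SKETCH_REPLACEMENT;
    }

    for (size_t i = 1; i <= need; i++) {
        if (i >= avail || !sketch_is_continuation(s[i])) {
            /* Truncated or interrupted sequence: resume at the
             * offending byte, which may itself start a character. */
            *retlen = i;
            return SKETCH_REPLACEMENT;
        }
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    *retlen = need + 1;
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return SKETCH_REPLACEMENT;  /* overlong, too large, or surrogate */
    return cp;
}
```

A caller would loop with "s += retlen" until s reaches end, without
having to inspect the return value for errors. Whether that loop emits
warnings is exactly the part this sketch does not model.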

