develooper Front page | perl.perl5.porters | Postings from August 2016

Re: Encode utf8 warnings

August 16, 2016 19:21
(moving from perl-unicode ML per request)

On Saturday 13 August 2016 19:41:46 wrote:
> Hello, I see that there is one big mess in utf8 warnings for Encode.
> First, warnings should be enabled by warnings pragma. For utf8 there
> are: utf8, non_unicode, nonchar, surrogate.
> Second, warnings for Encode can be enabled by check flag Encode::FB_WARN
> or Encode::WARN_ON_ERR.
> Third, warnings for perlio :encoding layer can be enabled via
> $PerlIO::encoding::fallback variable (same flags as for Encode module).
> And here is the problem:
> Should Encode utf8 throw warnings if:
> * utf8 warnings are enabled by pragma, but not enabled via Encode
>   check flags?
> * utf8 pragma warnings are disabled, but Encode WARN_ON_ERR bit enabled?
> * utf8 pragma warnings are disabled and $PerlIO::encoding::fallback
>   variable was not modified?
> There are a couple of bugs and comments about this problem:
> I think that we need to declare how utf8 pragma warnings should
> interact with Encode WARN_ON_ERR for Unicode encodings
> (Encode::utf8 and Encode::Unicode).
> Documentation:

My point of view:

The Encode module is generic, not only UTF-8 related. Warnings are
enabled only if the Encode::FB_WARN or Encode::WARN_ON_ERR check value
is passed.

So I would expect that the warnings utf8 pragma does not interfere with
the Encode::encode/decode functions by default (when called without
check flags or with Encode::FB_DEFAULT).
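
To make the difference between the check values concrete, here is a
minimal sketch. The helper name warnings_from and the sample byte string
are only illustrative, and the exact warning text may differ between
Encode versions, so the sketch only counts warnings.

```perl
use strict;
use warnings;
use Encode ();

# Illustrative helper: run a code block and collect the warnings it emits.
sub warnings_from {
    my ($code) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    $code->();
    return @seen;
}

my $bad = "abc\xFFdef";    # \xFF never appears in well-formed UTF-8

# FB_DEFAULT: the malformed byte is replaced by a substitution character.
my @default_warnings = warnings_from(sub {
    my $copy = $bad;
    Encode::decode('UTF-8', $copy, Encode::FB_DEFAULT());
});

# FB_WARN (WARN_ON_ERR | RETURN_ON_ERR): Encode reports the malformed byte.
my @fb_warn_warnings = warnings_from(sub {
    my $copy = $bad;       # decode may modify its input when a check is set
    Encode::decode('UTF-8', $copy, Encode::FB_WARN());
});

printf "FB_DEFAULT: %d warning(s)\n", scalar @default_warnings;
printf "FB_WARN:    %d warning(s)\n", scalar @fb_warn_warnings;
```

Note that "use warnings" is in effect for the whole script, yet for this
input only the call with FB_WARN is expected to produce a warning.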

On the other hand, when reading from a file handle with the :utf8 layer,
warnings are managed by the warnings utf8 pragma. To me it makes sense
that this should also apply to the :encoding layer with UTF-8 (or other
Unicode) encodings.

Currently the core module PerlIO::encoding enables warnings by default
for the :encoding layer, and those warnings can be controlled only via
the $PerlIO::encoding::fallback variable, not via the warnings utf8
pragma.
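
A small sketch of the current behaviour, assuming the fallback value is
read when the layer is pushed (i.e. at open time). The helper name
count_layer_warnings and the in-memory input are only illustrative; the
first flag combination is the default documented for PerlIO::encoding.

```perl
use strict;
use warnings;
use Encode ();
use PerlIO::encoding;   # load first, so the module does not reset $fallback later

# Illustrative helper: read one malformed line through :encoding(UTF-8)
# with the given fallback flags and return how many warnings were emitted.
sub count_layer_warnings {
    my ($fallback) = @_;
    my $count = 0;
    local $SIG{__WARN__} = sub { $count++ };
    local $PerlIO::encoding::fallback = $fallback;

    my $bytes = "abc\xFFdef\n";    # \xFF is not valid UTF-8
    open my $fh, '<:encoding(UTF-8)', \$bytes or die "open failed: $!";
    my $line = <$fh>;
    close $fh;
    return $count;
}

# Documented default of PerlIO::encoding: includes WARN_ON_ERR.
my $noisy = Encode::PERLQQ() | Encode::WARN_ON_ERR() | Encode::STOP_AT_PARTIAL();
# Same flags without WARN_ON_ERR.
my $quiet = Encode::PERLQQ() | Encode::STOP_AT_PARTIAL();

my $noisy_count = count_layer_warnings($noisy);
my $quiet_count = count_layer_warnings($quiet);

printf "with WARN_ON_ERR:    %d warning(s)\n", $noisy_count;
printf "without WARN_ON_ERR: %d warning(s)\n", $quiet_count;
```

In both calls "use warnings" is enabled, so the difference comes only
from the fallback variable.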

My proposal:

Add a new check flag to the Encode module which follows the utf8 pragma
warnings state (e.g. Encode::FOLLOW_UTF8_PRAGMA_WARN; you can invent a
better name). And add this flag to the PerlIO::encoding module by default.

Then patch the Encode::utf8 and Encode::Unicode modules (part of the
Encode package) to honor that new (FOLLOW_UTF8_PRAGMA_WARN) flag. That
means showing a Unicode warning if either of these conditions is true:
* FOLLOW_UTF8_PRAGMA_WARN is not set && WARN_ON_ERR is set
* FOLLOW_UTF8_PRAGMA_WARN is set && utf8 pragma warning is enabled

That would mean that utf8 warnings would still be shown if WARN_ON_ERR
is passed to Encode, FOLLOW_UTF8_PRAGMA_WARN is not passed, and the utf8
pragma warning is disabled.
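
The proposed rule can be sketched as a small truth function. Everything
here is hypothetical: FOLLOW_UTF8_PRAGMA_WARN does not exist in Encode,
the bit value 0x1000 is only a placeholder, and should_warn is an
illustrative name. The pragma state is passed in as a plain boolean
rather than looked up from the caller's scope.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical flag from the proposal above; not part of Encode.
# The bit value is a placeholder chosen for this sketch.
use constant FOLLOW_UTF8_PRAGMA_WARN => 0x1000;

# $check     - Encode check bitmask
# $pragma_on - true if utf8 pragma warnings are enabled for the caller
sub should_warn {
    my ($check, $pragma_on) = @_;

    # Flag set: only the pragma state decides.
    if ($check & FOLLOW_UTF8_PRAGMA_WARN) {
        return $pragma_on ? 1 : 0;
    }

    # Flag not set: only WARN_ON_ERR decides, as Encode does today.
    return ($check & Encode::WARN_ON_ERR()) ? 1 : 0;
}

my $warn   = Encode::WARN_ON_ERR();
my $follow = FOLLOW_UTF8_PRAGMA_WARN;

for my $case (
    [ 'WARN_ON_ERR, pragma off',          $warn,           0 ],
    [ 'FOLLOW, pragma off',               $follow,         0 ],
    [ 'FOLLOW, pragma on',                $follow,         1 ],
    [ 'FOLLOW|WARN_ON_ERR, pragma off',   $follow | $warn, 0 ],
    [ 'no flags, pragma on',              0,               1 ],
) {
    my ($label, $check, $pragma_on) = @$case;
    printf "%-32s => %s\n", $label,
        should_warn($check, $pragma_on) ? 'warn' : 'silent';
}
```

With this rule the pragma never affects plain Encode::encode/decode
calls, and PerlIO::encoding would follow the pragma only because it
passes the new flag by default.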

What do you think about it?