Re: Encode utf8 warnings
From: pali
Date: August 16, 2016 19:21
Subject: Re: Encode utf8 warnings
Message ID: 201608151939.03994@pali
(moving from perl-unicode ML per request)
On Saturday 13 August 2016 19:41:46 pali@cpan.org wrote:
> Hello, I see that there is one big mess in utf8 warnings for Encode.
>
> First, warnings should be enabled by warnings pragma. For utf8 there
> are: utf8, non_unicode, nonchar, surrogate.
>
> Second, warnings for Encode can be enabled by check flag Encode::FB_WARN
> or Encode::WARN_ON_ERR.
>
> Third, warnings for perlio :encoding layer can be enabled via
> $PerlIO::encoding::fallback variable (same flags as for Encode module).
>
> And here is the problem:
>
> Should Encode utf8 throw warnings if:
>
> * utf8 warnings are enabled by pragma, but not enabled via Encode
> check flags?
> * utf8 pragma warnings are disabled, but Encode WARN_ON_ERR bit enabled?
> * utf8 pragma warnings are disabled and $PerlIO::encoding::fallback
> variable was not modified?
>
> There are a couple of bugs and comments about this problem:
> https://rt.cpan.org/Public/Bug/Display.html?id=88592
> https://github.com/dankogai/p5-encode/pull/26#issuecomment-235641347
> https://rt.perl.org/Public/Bug/Display.html?id=128788
> https://rt.cpan.org/Public/Bug/Display.html?id=116629
> https://github.com/dankogai/p5-encode/issues/59
> https://github.com/dankogai/p5-encode/commit/a6c2ba385875c2c03bd42350e23aef0188fb23b0
> https://github.com/dankogai/p5-encode/commit/07c8adb58e55c7cf66b3d6673bf50010fe1a69ea
>
> I think that we need to declare how utf8 pragma warnings should
> interact with Encode WARN_ON_ERR for Unicode encodings
> (Encode::utf8 and Encode::Unicode).
>
> Documentation:
> https://metacpan.org/pod/warnings
> https://metacpan.org/pod/Encode#FB_WARN
> https://metacpan.org/pod/PerlIO::encoding
My point of view:

The Encode module is generic and not only UTF-8 related. Warnings are
enabled only if the Encode::FB_WARN or Encode::WARN_ON_ERR check value is
passed. So I would expect that the utf8 warnings pragma does not interfere
with the Encode::encode/decode functions by default (when called without
check flags or with Encode::FB_DEFAULT).
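
To make the check-flag behavior concrete, here is a minimal sketch that decodes the same malformed byte string twice, once with the default check and once with Encode::FB_WARN, and counts the warnings each call produces. The input string and variable names are made up for illustration, and the exact behavior may differ between Encode versions, so treat it as something to try rather than a specification.

```perl
use strict;
use warnings;
use Encode ();

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Illustrative input: the byte 0xFF never appears in well-formed UTF-8.
my $bytes = "abc\xFFdef";

# Default check (same as Encode::FB_DEFAULT): no WARN_ON_ERR bit is set.
my $quiet       = Encode::decode('UTF-8', $bytes);
my $quiet_count = scalar @warnings;

# FB_WARN includes the WARN_ON_ERR bit; LEAVE_SRC keeps $bytes unmodified.
@warnings = ();
my $loud = Encode::decode('UTF-8', $bytes,
    Encode::FB_WARN() | Encode::LEAVE_SRC());
my $loud_count = scalar @warnings;

print "default check: $quiet_count warning(s)\n";
print "FB_WARN:       $loud_count warning(s)\n";
```

Both calls run under "use warnings", so any difference in the counts comes from the check flag and not from the pragma.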

On the other hand, when reading from a file handle with the :utf8 layer,
warnings are managed by the utf8 warnings pragma. To me it makes sense
that this should also apply to the :encoding layer with UTF-8 (or other
Unicode) encodings.

Currently the core module PerlIO::encoding enables warnings by default for
the :encoding layer, and those warnings can be controlled only via the
$PerlIO::encoding::fallback variable, not via the utf8 warnings pragma.

My proposal:

Add a new check flag to the Encode module which follows the state of the
utf8 pragma warnings (e.g. Encode::FOLLOW_UTF8_PRAGMA_WARN; feel free to
invent a better name), and set this flag by default in the
PerlIO::encoding module.

Then patch the Encode::utf8 and Encode::Unicode modules (part of the
Encode package) to honor the new FOLLOW_UTF8_PRAGMA_WARN flag. That means
showing a Unicode warning if either of these conditions is true:
* FOLLOW_UTF8_PRAGMA_WARN is not set && WARN_ON_ERR is set
* FOLLOW_UTF8_PRAGMA_WARN is set && the utf8 pragma warning is enabled

That would mean that utf8 warnings are still shown when WARN_ON_ERR is
passed to Encode and FOLLOW_UTF8_PRAGMA_WARN is not, even if the utf8
pragma warning is disabled.
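
The two conditions above can be written down as a small predicate. This is only a sketch of the proposed rule in plain Perl: FOLLOW_UTF8_PRAGMA_WARN is the hypothetical flag from this proposal, the bit values of both constants are placeholders chosen for the example and are not taken from Encode, and should_warn is an invented name.

```perl
use strict;
use warnings;

# Placeholder bit values for illustration only; they are not read from
# Encode, and FOLLOW_UTF8_PRAGMA_WARN is the hypothetical proposed flag.
use constant {
    WARN_ON_ERR             => 0x0002,
    FOLLOW_UTF8_PRAGMA_WARN => 0x1000,
};

# Return 1 if a Unicode warning should be shown under the proposed rule:
#   * flag not set: the decision depends only on WARN_ON_ERR
#   * flag set:     the decision depends only on the utf8 pragma state
sub should_warn {
    my ($check, $utf8_pragma_enabled) = @_;
    if ($check & FOLLOW_UTF8_PRAGMA_WARN) {
        return $utf8_pragma_enabled ? 1 : 0;
    }
    return ($check & WARN_ON_ERR) ? 1 : 0;
}

# Print the full truth table for the three inputs.
for my $follow (0, 1) {
    for my $warn_on_err (0, 1) {
        for my $pragma (0, 1) {
            my $check = ($follow      ? FOLLOW_UTF8_PRAGMA_WARN : 0)
                      | ($warn_on_err ? WARN_ON_ERR             : 0);
            printf "follow=%d warn_on_err=%d pragma=%d => warn=%d\n",
                $follow, $warn_on_err, $pragma, should_warn($check, $pragma);
        }
    }
}
```

Note that under this reading WARN_ON_ERR has no effect once the new flag is set; the pragma alone decides.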

What do you think about it?