On Mon, Aug 2, 2021 at 11:17 AM Veesh Goldman <rabbiveesh@gmail.com> wrote:

>> My point is still that this:
>>
>> -----
>> use v5.36;
>> print 'Hello, world!';
>> -----
>>
>> … should not be “subtly wrong”.
>>
>> -F
>
> Since 5.36 is meant to turn on warnings, this will be explicitly wrong,
> not subtly.
>
> Perhaps the "wide character" warning is too unclear, but we can always
> improve the text to include a doc link as such.
>
> What compels me more is the following example.
> Let's say I'm looking for customers in my database named josé. Easy, I'll
> use DBIC:
>
> $customer_rs->search({ name => 'josé' })
>
> But when I run it, I get nothing. That's because the various DBDs will
> handle encoding and decoding for you, bc perl is meant to deal with text in
> userland.
>
> Had utf8 been turned on, then I would've started with text, not bytes, and
> found my customers instead of mojibake (though on the other hand, the non
> utf8 is a great way to find double encoded text).
>
> I think this is a more realistic example than printing a string literal,
> where the behavior is surprising and conceptually inconsistent.

Yes, this is a tradeoff between interfaces that will expect bytes and
interfaces that will expect characters, as both exist in modern Perl.
STDOUT and STDERR expect bytes unless one does
"use open ':std', IO => ':encoding(UTF-8)';", which changes the assumption
of those interfaces, so it isn't great. DBI drivers, Mojolicious
interfaces, etc. expect characters.

I think it is both true that having "use utf8" in use VERSION will
surprise people, and not having it in use VERSION will continue to
surprise people. I think we can make this step with proper documentation,
but we must understand the concerns Felipe mentions are real.

-Dan
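
To make the bytes-versus-characters tradeoff concrete, here is a minimal sketch using only the core Encode module. It writes the name with explicit escapes rather than a literal so it does not depend on how the file is saved; the variable names are mine, and the "character interface" step is only a stand-in for what a DBI driver or similar layer might do, not the behavior of any particular driver.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# What a UTF-8 source file gives you for 'josé' WITHOUT "use utf8":
# five octets, with the e-acute stored as the two bytes C3 A9.
my $bytes = "jos\xC3\xA9";

# What "use utf8" would give you for the same literal: four characters.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length %d\n", length $bytes;   # 5
printf "chars: length %d\n", length $chars;   # 4

# Assumption for illustration: an interface that expects characters
# encodes whatever it receives before sending it on.
my $single = encode('UTF-8', $chars);   # correct: same five octets
my $double = encode('UTF-8', $bytes);   # double encoded: seven octets

printf "single-encoded: %d octets\n", length $single;   # 5
printf "double-encoded: %d octets\n", length $double;   # 7

# STDOUT expects bytes by default, so encode characters explicitly
# instead of changing the handle's layers globally.
print encode('UTF-8', $chars), "\n";
```

The double-encoded value is what would be compared against the stored name in the search example above, which is one way such a lookup can come back empty.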