> On Oct 4, 2021, at 4:45 AM, Yuki Kimoto <kimoto.yuki@gmail.com> wrote:
>
> 2021-10-4 3:57 Ricardo Signes <perl.p5p@rjbs.manxome.org> wrote:
>
> > ONE: What's the end state we'd like to get to?
>
> I have a question.
>
> echo -e '１' | perl -p -E 's/\d/1/'
>
> '１' of echo argument is Japanese UTF-8. Output is ASCII 1.
>
> Current Output(UTF-8 １)
>
> １
>
> Ideal Output(ASCII 1)
>
> 1
>
> Do you want this to work ideally in the UNIX/Linux system?

For that to happen you would pass the `-CIO` flag to perl, which causes STDIN & STDOUT to automatically decode/encode UTF-8.

The one-liner as-is outputs "\xef\xbc\x91" (U+FF11 in UTF-8) instead of ASCII 1 because those 3 bytes are what Perl receives on STDIN, and nothing is decoding those to U+FF11. Your s/\d/1/ only works on *digits*, and none of U+00EF, U+00BC, or U+0091 is. So no change happens.

-FG