Re: Pre-RFC: Width-aware (s)printf flag

From:
Dan Book
Date:
July 1, 2021 16:52
Subject:
Re: Pre-RFC: Width-aware (s)printf flag
Message ID:
CABMkAVVJOBD9nzoaRm7kargZ+yxDcnjKk0B2X7=W7upF93Yvmg@mail.gmail.com
On Thu, Jul 1, 2021 at 12:46 PM Paul "LeoNerd" Evans <leonerd@leonerd.org.uk>
wrote:

> Consider
>
>   printf "%-40s : %s\n", $_->@* for @rows;
>
> The intention is to print a nice neat table on the terminal.
>
> This works fine in ASCII but gets all confused if any ->[0] element
> contains Unicode text. While Perl will count in Unicode codepoints,
> this won't help if there are combining chars (because combining chars
> count as codepoints but do not consume terminal columns), or if there
> are any emoji or other double-width characters (because these single
> graphemes count as two columns).
>
> I propose a new printf flag, perhaps `|`, to tell (s)printf to count
> these strings by terminal width instead. Thus
>
>   printf "%-|40s : %s\n", $_->@* for @rows;
>
> would now print a neat table even in the presence of Weird Unicode.
>

As mentioned on IRC, I think it would be nice to have more grapheme-aware
capability in core; right now the only grapheme-aware functionality I know
of is the \X regular expression escape, which matches a single extended
grapheme cluster (plus more manual approaches using Unicode::UCD).
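
To illustrate what \X already allows, here is a minimal sketch that counts
graphemes and pads by that count. The helper names grapheme_count and
pad_graphemes are hypothetical, not core functions, and the sketch only
handles combining characters: it treats every grapheme as one column, so
double-width characters would still be misaligned.

```perl
use strict;
use warnings;

# Hypothetical helper (not a core function): count extended grapheme
# clusters by matching \X repeatedly in list context.
sub grapheme_count {
    my ($str) = @_;
    my $count = () = $str =~ /\X/g;
    return $count;
}

# Hypothetical helper: left-justify to $width graphemes, similar to
# "%-*s" but counting \X matches instead of codepoints.
sub pad_graphemes {
    my ($str, $width) = @_;
    my $missing = $width - grapheme_count($str);
    return $missing > 0 ? $str . (' ' x $missing) : $str;
}

my $word = "cafe\x{301}";   # "cafe" followed by COMBINING ACUTE ACCENT
printf "%d codepoints, %d graphemes\n",
    length($word), grapheme_count($word);
```

Here length() reports one more than the grapheme count because the
combining accent is its own codepoint but joins the preceding "e" as a
single grapheme.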

There is one potential problem here: you normally need to encode characters
to bytes in order to print them, and the grapheme determination would need
to happen before encoding. This would work out if you're printing to a
handle with an encoding layer, but would probably cause confusion in the
usual case, where the string has already been encoded by the time it
reaches (s)printf.
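
A small sketch of the ordering issue, using the existing %-8s padding and
the Encode module (the variable names are only for illustration). The same
text gets a different amount of padding depending on whether sprintf sees
decoded characters or already-encoded bytes:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $chars = "caf\x{e9}";              # 4 characters (decoded string)
my $bytes = encode('UTF-8', $chars);  # 5 bytes once encoded

# Padding computed on characters: 4 characters + 4 spaces.
my $padded_chars = sprintf "%-8s|", $chars;

# Padding computed on already-encoded bytes: 5 bytes + 3 spaces,
# so this row ends up one column short on the terminal.
my $padded_bytes = sprintf "%-8s|", $bytes;

print encode('UTF-8', $padded_chars), "\n";  # encode after formatting
print $padded_bytes, "\n";                   # formatted after encoding
```

Any width-aware flag would presumably hit the same problem: it can only
measure correctly if it is applied to the decoded string.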

-Dan


