develooper Front page | perl.perl5.porters | Postings from May 2014

Re: RFC: Making -B and -T work better on 8-bit encodings

May 10, 2014 21:50
Karl Williamson wrote:
> It seems to me that by lowering the ratio so that greater than about 
> 15-20% non-text cause the file to be classified as binary, while 
> expanding the text characters by the 95 upper Latin1 printable 
> characters (except for \xFF) will give good results, better than the 
> existing.

Why do we use percentage cutoffs in the first place? Either it is 
printable/glyphable or it's not. perlfunc does document the percentage 
behavior, so I would guess ANY change to the algorithm would break 
backcompat for the few people willing to use such an unreliable 
algorithm. I would suggest leaving it alone as a 
backcompat/legacy/obsolete feature, or deprecating and removing -T/-B 
and telling people to use CPAN or something smarter for their specific 
purpose.

Being purely printable doesn't mean a string or piece of data is 
risk-free, but a fixed set of rules is better than a percentage "guess".
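
To make the contrast concrete, here is a minimal sketch of the two approaches side by side. It is not perl's actual -T/-B implementation; the sub names, the allowed byte set (printable ASCII plus tab, LF, CR and FF), and the 0.30 default cutoff are all assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of bytes in $buf that fall outside
# printable ASCII and common whitespace (tab, LF, CR, FF).
sub non_text_ratio {
    my ($buf) = @_;
    return 0 if length($buf) == 0;
    # The "= () =" idiom counts the matches of a /g regex.
    my $odd = () = $buf =~ /[^\x20-\x7E\t\n\r\f]/g;
    return $odd / length($buf);
}

# Ratio-style guess: binary if the buffer holds a NUL byte or if more
# than $cutoff of its bytes are non-text. The cutoff is an assumption.
sub guess_binary {
    my ($buf, $cutoff) = @_;
    $cutoff = 0.30 unless defined $cutoff;
    return 1 if index($buf, "\0") >= 0;
    return non_text_ratio($buf) > $cutoff ? 1 : 0;
}

# Fixed rule: text only if every single byte is in the allowed set.
sub is_strictly_text {
    my ($buf) = @_;
    return $buf !~ /[^\x20-\x7E\t\n\r\f]/ ? 1 : 0;
}
```

With these definitions, a Latin-1 string such as "caf\xE9 au lait\n" counts as text under guess_binary() (one odd byte in thirteen is below the cutoff) but is rejected by is_strictly_text(). The fixed rule gives the same answer no matter how much surrounding ASCII there is, while the ratio result depends on the mix.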
