
Re: perlpod rewrite, draft 1 && perlpodspec, draft 1

August 11, 2001 17:46

Sean M. Burke wrote:

    E<escape>   A named character (very similar to HTML escapes)
                  E<lt>         A literal <
                  E<gt>         A literal >
                  E<sol>        A literal /
                  E<verbar>     A literal |
                   (The above are optional except in other interior
                    sequences, notably L<>, and when preceded by a
                    capital letter)
                  E<0n>         ASCII character number n (octal)
                  E<n>          ASCII character number n (decimal)
                  E<0xn>        ASCII character number n (hex)
                  E<html>       Some non-numeric HTML entity, such
                                  as E<Agrave>
                (Older pod formatters might not recognize octal or
                hex numeric escapes.)
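
The three numeric forms in the table above can be told apart by prefix alone. The following Python sketch shows one way to do that; the helper name and the exact grammar are assumptions read off the quoted table, not taken from any existing pod formatter. It returns a bare number and deliberately says nothing about which coded character set that number indexes.

```python
import re

# Hypothetical grammar for the inside of E<...>, based only on the table
# quoted above: 0x + hex digits, 0 + octal digits, or plain decimal.
_NUMERIC_ESCAPE = re.compile(r"0[xX]([0-9A-Fa-f]+)|0([0-7]+)|([1-9][0-9]*|0)")


def numeric_escape_to_number(body):
    """Return the number named by a numeric escape body, or None otherwise.

    `body` is the text between the angle brackets, e.g. "0xE9" for E<0xE9>.
    Named escapes such as "eacute" are not numeric, so they yield None.
    """
    match = _NUMERIC_ESCAPE.fullmatch(body)
    if match is None:
        return None
    hex_digits, octal_digits, decimal_digits = match.groups()
    if hex_digits is not None:
        return int(hex_digits, 16)
    if octal_digits is not None:
        return int(octal_digits, 8)
    return int(decimal_digits, 10)
```

Under these assumptions the bodies "233", "0351" and "0xE9" all name the same number, while "eacute" is left for a separate lookup of named escapes.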

Is there any way you might be convinced that the acronym
"ASCII" is not completely accurate here?  How about
doing an edit and replacing s/ASCII/coded/g in that part?

In a later message you wrote:

A B<blank line> is a line consisting entirely of zero or more spaces
(ASCII 32) or tabs (9), and terminated by a newline or end-of-file.
A B<non-blank line> is a line containing one or more characters other
than space or tab (and terminated by a newline or end-of-file).
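
Read literally, the definition above amounts to a one-line test. A minimal Python sketch follows; the function name is hypothetical, and it assumes the line has already been decoded so that space and tab are the characters " " and "\t".

```python
import re

# Zero or more spaces or tabs, and nothing else.
_ONLY_SPACES_AND_TABS = re.compile(r"[ \t]*")


def is_blank_line(line):
    """True if `line` holds only spaces and tabs before its line ending.

    The line ending (or its absence at end-of-file) is removed first, so
    both "\n" and "" count as blank.
    """
    content = line.rstrip("\r\n")
    return _ONLY_SPACES_AND_TABS.fullmatch(content) is not None
```

Note that other whitespace, such as a no-break space, makes the line non-blank under this reading.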

how about "tabs (ASCII 9)," there?

Later on you mentioned:

Characters in pod documents may be conveyed either as literals, or by
number in EE<lt>n> sequences, or by an equivalent mnemonic, as in
EE<lt>eacute> which is equivalent to EE<lt>233>.

In one of the coded character sets in which I often run pod2man the eacute
character is at code point 81, not 233.
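
The point can be checked with a short Python sketch that asks where eacute sits in a few coded character sets. The sets chosen here (Latin-1, EBCDIC code page 037, MacRoman) are assumptions for illustration; the message does not name the set in which eacute is at 81.

```python
# Where does "é" (LATIN SMALL LETTER E WITH ACUTE) live in each set?
E_ACUTE = "\u00e9"


def code_point_in(charset, char=E_ACUTE):
    """Return the single-byte code point of `char` in `charset`."""
    encoded = char.encode(charset)
    if len(encoded) != 1:
        raise ValueError(f"{charset} does not encode {char!r} as one byte")
    return encoded[0]


# Codec names are Python's; the selection is illustrative only.
for charset in ("latin-1", "cp037", "mac_roman"):
    print(charset, code_point_in(charset))
```

With Python's standard codecs this reports 233 for latin-1, 81 for cp037 and 142 for mac_roman, so a bare number in an escape only means eacute once the coded character set is fixed.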

In the next paragraph you mentioned:

Characters in the range 32-126 refer to US-ASCII characters, which
all pod formatters must render faithfully.  Characters in the ranges
0-31 and 127-159 should not be used, except for the literal sequences
for newline (13, 13 10, or 10), and tab (9).

I have seen a 10-year-old copy of the ANS standard for ASCII and they did
in fact include characters and code points for 0 .. 31 (BTW they called
character 10 "line feed" but there are many Unix vendors that supply ascii
man pages that incorrectly label it "NL" or "newline" or some outrageously
incorrect name adapted from the C programming language as implemented on

You then wrote:

Characters in the range 160-255 refer to Latin-1 characters (also
defined there by Unicode, with the same meaning).  Characters above
255 should be understood to refer to Unicode characters.  Be warned
that some formatters cannot reliably render anything outside 32-126;
and many are able to handle 32-126 and 160-255, but nothing above

Perhaps the pod standard is now trying to dictate which coded character set
folks really should be using, but you should at least be aware that Windows
codepage 1252 does not cover the Latin-1 character set and neither does
the MacRoman coded character set.  In another development it would appear
that many Unix vendors are making arrangements to switch from a default of
the ISO 8859-1 coded character set for 0..255 (in North American and
Western European locales) to the ISO 8859-15 coded character set so as to
include the Euro currency symbol (at the very least).  That set in the
range 160-255 may not be called "Latin-1" (I would not be surprised if it
was called "Latin-15" but I do not follow these ISO developments too
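
As a small illustration of how these sets diverge, the Python sketch below looks up the Euro sign in three of them. The codec names are Python's and the helper name is hypothetical. For reference, ISO 8859-15 is commonly known as Latin-9.

```python
EURO = "\u20ac"  # EURO SIGN


def euro_code_point(charset):
    """Return the Euro sign's code point in `charset`, or None if absent."""
    try:
        return EURO.encode(charset)[0]
    except UnicodeEncodeError:
        return None


for charset in ("latin-1", "iso8859_15", "cp1252"):
    print(charset, euro_code_point(charset))

# The same number names a different character in ISO 8859-1:
print(repr(bytes([164]).decode("latin-1")))
```

ISO 8859-1 has no Euro sign at all, ISO 8859-15 places it at 164 (inside the 160-255 range), and Windows codepage 1252 places it at 128. Code point 164 in ISO 8859-1 is the generic currency sign.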


Peter Prymmer
