
Re: The future of POSIX in core

From: Father Chrysostomos
Date: September 11, 2011 13:54
Subject: Re: The future of POSIX in core
Message ID: 456C3CC2-A390-46A0-84AA-554B81E6B90A@cpan.org

On Sep 5, 2011, at 1:31 AM, Mark Overmeer wrote:

> The full text of the current state of my POSIX.pod replacement, (which
> has not been adapted to the discussions of this week) is attached here.
> Next week I hope to have time to create a bunch of patches for blead
> which do follow peoples wishes. (You can contribute textual fixes now)
> 
> Do/can we fix Perl’s number-parsing?

*I* don’t know enough about floating-point storage to do it.  Basically, large numbers end up with rounding errors that occur when the number is repeatedly multiplied by 10 as it is scanned.  I once glanced at SpiderMonkey’s number-parsing algorithm.  It starts out the same way as perl’s, but then it has several pages of unreadable code to deal with rounding errors.
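
To illustrate the kind of rounding described above, here is a minimal sketch. The naive_parse helper is hypothetical and is not perl's actual parser; it only mimics the multiply-by-10 accumulation. The sample string is arbitrary, and whether the three results differ depends on the perl version and the platform's C library.

```perl
use strict;
use warnings;
use POSIX qw(strtod);

# Hypothetical helper (not perl's real implementation): scan an unsigned
# decimal integer string, multiplying the running value by 10 per digit.
# Once the value no longer fits exactly in the numeric type in use,
# each further step may round.
sub naive_parse {
    my ($digits) = @_;
    my $value = 0.0;
    for my $d (split //, $digits) {
        $value = $value * 10 + $d;
    }
    return $value;
}

my $str    = '123456789012345678901234567890';   # arbitrary sample input
my $naive  = naive_parse($str);
my $native = $str + 0;        # perl's own string-to-number conversion
my $libc   = strtod($str);    # the C library's conversion (scalar context)

printf "naive:  %.17g\nnative: %.17g\nstrtod: %.17g\n",
    $naive, $native, $libc;
```

Any difference should show up only in the last few significant digits, if at all.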

I started to read through your document, suggesting changes, but then I realised that this document is probably better as a separate POSIX::Overview document. POSIX.pod could point to it. If I’m just trying to find out what the POSIX module itself does, there is a lot of information to sift through here. Maybe parts of your document (descriptions, for instance) could be in POSIX.pod, and it could be referenced from entries in POSIX::Overview.

I stopped about a third of the way through, as this document is very long, and most of my suggestions were stylistic suggestions that could be applied to the whole thing.

> =head1 NAME
> 
> POSIX - Perl interface to IEEE Std 1003.1
> 
> =head1 SYNOPSIS
> 
>     use POSIX;
>     use POSIX qw(setsid);
>     use POSIX qw(:errno_h :fcntl_h);
> 
>     printf "EINTR is %d\n", EINTR;
>     $sess_id = setsid();
> 
>     $fd = POSIX::open($path, O_CREAT|O_EXCL|O_WRONLY, 0644);
> 	# note: that's a filedescriptor, *NOT* a filehandle
> 
> =head1 DESCRIPTION
> 
> The I<Single UNIX Specification: Authorized Guide to Version 4> published
> by I<The Open Group> is a huge specification. Issue 7 (March 2010) lists
> nearly 1200 functions which are available on many platforms. A large part
> of these functions are defined in POSIX 1003.1-2008, which gives the name
> to this module.
> 
> The specification describes how C-programmers can access the system. Perl

I wouldn’t hyphenate C-programmers.

> core and some modules offer most of the POSIX functionality, although
> usually in a more powerful way.
> 
> This manual-page tries to explain where you can find these POSIX features
> in the Perl language. Besides, the module does provide implementations
> for a few dozen POSIX functions and hundreds of POSIX constants which
> are in the standard but not in Perl core.
> 
> =head2 Backwards compatibility
> 
> With Perl 5.14, this module's implementation and documentation has been

s/has/have/

> reworked considerably. In most common cases, you will receive warnings
> when importing functions which were removed because they only croaked
> anyway. You can safely remove those from your import list.

If this doc patch is to be considered separately from your other proposed changes, then that paragraph needs a rewrite.

> 
> For reason of unfortunate backwards compatibility, functions are
> I<exported by default> except when they have the same name as built-in
> Perl functions, such as C<abs>, C<alarm>, C<rmdir> and C<write>.
> You probably do not want those functions anyway, because they only
> redirect you to the core function.
> 
> This manual page does not explain what the functions do anymore, because
> the POSIX manual do a much better job.

‘manuals do’ or ‘manual does’

Someone mentioned on p5p that it’s good to document them, as different manual pages say different things.

> 
> =head2 Caveats
> 
> POSIX is a standard with a long history, deeply tied into the Operating
> System. This means that you will encounter (small) differences when
> used on different platforms. Perl core and this POSIX module makes

s/makes/make/

> attempts to hide differences, but is not always succesful.

s/is/are/

> 
> =head2 How to use this manual page
> 
> For each of the POSIX functions, we try to direct you to the best
> documentation on how to get related functionality in Perl. Even the
> most simple functions will have differences. We do not explain
> the POSIX functions because the standard POSIX manuals do a much
> better job.
> 
> Common references are to C<< POSIX.xs >>, which means that it is
> provided in XS code by this POSIX module. In all cases, when C<undef>
> is returned it means failure. In that case, C<< $! >> is set.
> 
> A reference to C<< POSIX.pm >> means that it is implemented as a perl
> subroutine by this module.  In either case, you require this module in
> your script preferably explicitly naming the functions you use.

Whether it’s POSIX.pm or POSIX.xs is an implementation detail.  I don’t think we should include that.

> 
> Often, the description will refer you to C<<perlfunc>>, which means

L<perlfunc>

(Also, double angle brackets require inner whitespace.  Your text renders in nroff as "<perlfunc">.)
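
A small sketch for checking this yourself, assuming the Pod::Text that ships with perl (the sample paragraphs are made up for illustration). It renders the malformed form next to the two well-formed ones so the stray bracket is visible in the output:

```perl
use strict;
use warnings;
use Pod::Text;

# Three spellings of the same reference; only the first is malformed,
# because double angle brackets need whitespace inside them.
my $pod = join "\n\n",
    '=pod',
    'Broken: C<<perlfunc>>',
    'Code: C<< perlfunc >>',
    'Link: L<perlfunc>',
    "=cut\n";

my $parser = Pod::Text->new;
$parser->output_string(\my $text);     # collect output in $text
$parser->parse_string_document($pod);
print $text;
```

The exact quoting in the output depends on the formatter and its version, so compare the three lines rather than expecting a particular string.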

> that there is a perl core function which does similar things as the
> POSIX function. On most platforms you can easily find the perlfunc
> descriptions via C<<perldoc>>. For instance:

L<perldoc>

> 
>    perldoc -f write
> 
> Another important source for Operating System facts are global variables.

Originally, all the Perl documentation used double spaces after full stops.  pod2man still formats it that way if there is a dot at the end of a line.  So you should either do s/\. (?! )/.  /g or make sure there are no dots at the ends of lines except at the ends of paragraphs.
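
A quick sketch of that substitution on a made-up sample string. It only touches a dot followed by exactly one space, so sentences that already have two spaces are left alone:

```perl
use strict;
use warnings;

# Sample text mixing single and double spaces after full stops.
my $pod_text = "First sentence. Second sentence.  Third sentence. End.";

# Copy the text, then widen ". " to ".  " unless a second space follows.
(my $fixed = $pod_text) =~ s/\. (?! )/.  /g;

print "$fixed\n";
# prints: First sentence.  Second sentence.  Third sentence.  End.
```

Note that a blanket substitution like this would also hit abbreviations such as "e.g. " and any dots inside verbatim paragraphs, so the result is worth reviewing by hand.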

> Those are described in the C<<perlvar>> manual. Those variables have

L<perlvar>

> extremely short names (like C<< $$ >> for process-id) but also longer
> names (C<<$PID>> and C<<$PROCESS_ID>>). The latter are only available

Change C<<...>> to C<...> or C<< ... >> throughout.

> when you require the module C<<English>>. Names defined by the awk

L<English>

> programming language are amongst the long alternative names.
> 
> Finally, the table often refers to modules which implement comparible

comparable

> functionality. In most cases, they offer a pure Perl (so portable)
> alternative. Be warned that those modules are not always part of
> the Perl core code, so controled by other people.

Be warned that some/many of those modules are not included with Perl, but are maintained separately on CPAN.

(If you do not agree with my rewording: controlled has two l’s; ‘so’ should be ‘so they are’ or ‘and hence are’.)

> 
> =head1 FUNCTIONS
> 
> The functions are categorized as listed in chapter 3.3 of the book of
> the Open Group, with as exception those sections which only list

s/with as exception/with the exception of/

or just ‘except for’

> unsupported features.
> 
> Everything related to threads, scheduling and tracing control is not
> available in Perl.

Using ‘every’ with a ‘not’ later in the sentence usually sounds confusing to a native speaker (which I am), even if it is technically correct.

Either change ‘not available’ to ‘unavailable’, or, better, change ‘everything’ to ‘nothing’ and omit the ‘not’.

> See L<perlthrtut> on the thread implementation of
> Perl.
> 
> =over 4
> 
> =item * any posix_* function is not supported
> 
> =item * any sched_* function is not supported
> 
> =item * any pthread_* function is not supported
> 
> =item * any thread-safe function *_r is not supported

Functions beginning with posix_ are not supported, etc.

> 
> =back
> 
> =head2 Asynchronous Input and Output Interfaces
> 
> The functions are provided by M<IO::AIO>:

These functions

>   aio_cancel, aio_error, aio_fsync, aio_read, aio_return, aio_suspend,
>   aio_write
> 
> Not supported:
>   lio_listio
> 
> =head2 Jump Interfaces
> 
> These functions are used to handle error conditions. In Perl, you cannot
> handle these errors on such a low level.

s/on/at/

> 
>   longjmp perlfunc/die
>   setjmp  perlfunc/eval
> 
> =head2 Maths Library Interfaces
> 
> The accuracy of floating point numbers depends on compilation flags
> of Perl. On C level, there are many versions of these modules, but

s/On/At the/

By modules, do you mean functions?

> only one is listed in this section.
> 
> L<Math::Trig> provides pure perl implementations for all sinus,

s/sinus/sine/

> cosine and tangent functions and their hyperbolic and reverse
> variants. L<Math::Complex> provides their complex variants. In both
> modules, the names of the functions do sometimes differ slightly from

s/do //

> the POSIX names.
> 
> The modules M<Math::BigInt> and M<Math::BigFloat> offer a wide range

s/M</L</g

> of mathematical functions with flexible precission.
> 
> Provided by Perl core (see L<perlfunc>) are:

s/Perl/the Perl/
s/ are//

>   atan2           perlfunc/atan2
>   cos             perlfunc/cos
>   exp             perlfunc/exp
>   fabs            perlfunc/abs
>   isgreaterequal  perlop/>=
>   isgreater       perlop/>
>   islessequal     perlop/<=
>   islessgreater   perlop/!=
>   isless          perlop/<
>   llrint          perlfunc/int
>   log             perlfunc/log
>   lrint           perlfunc/int
>   pow             perlop/**
>   rint            perlfunc/int
>   sin             perlfunc/sin
>   sqrt            perlfunc/sqrt
> 
> Provided by POSIX.xs are
>   acos, asin, atan, ceil, cosh, floor, fmod, frexp, ldexp, log10,
>   modf, sinh, tan, tanh
> 
> Provided by L<List::Util>:
>   fmax    min
>   fmin    max
> 
> Complex numbers only in Perl wth L<Math::Complex>:
>   cabs, cacos, cacosh, carg, casin, casinh, catan, catanh, cbrt,
>   ccos, ccosh, cexp, cimag, clog, conj, cpow, cproj, creal, csin,
>   csinh, csqrt, ctan, ctanh
> 
> Only in pure Perl implementations with L<Math::Trig>
>   acosh, asinh, atanh
> 
> Probably not supported:
>   copysign, erf, erfc, exp2, expm1, fdim, fma, fpclassify, hypot,
>   ilogb, isfinite, isinf, isnan, isnormal, isunordered, lgamma,
>   llround, log1p, log2, logb, lround, nan, nearbyint, nextafter,
>   nexttoward, remainder, remquo, round, scalbln, scalbn, signbit,
>   tgamma, trunc
> 
> =head2 General ISO C Library Interfaces
> 
> =head3 convert strings to values and back
> 
> All C<strto*>, C<atof>, C<atoi> and friends functions are not needed
> in Perl:

This is what I was referring to.  Instead of saying they are not needed, perhaps you could mention that the system’s strtod sometimes provides more accurate number parsing for large numbers.

> the integers and floats are at their largest size, so when a
> string is used in numeric context it will get converted automatically.
> 
> Still, POSIX.xs does provide a few of those functions, although
> you can probably better use regular expressions to validate the
> input.  These functions should respect any locale settings.
> 
> =over 4
> 
> =item strtod
> 
> String to double translation. Returns the parsed number and the number
> of characters in the unparsed portion of the string. When called in a
> scalar context C<strtod> returns the parsed number.
> 
> =item strtol
> 
> String to integer translation. Returns the parsed number and
> the number of characters in the unparsed portion of the string.
> When called in a scalar context C<strtol> returns the parsed number.
> 
> The base should be zero or between 2 and 36, inclusive.

If the base can be any number in that range, then this function certainly *is* useful in Perl! I’ve written my own implementation of that, without realising it had already been done for me.
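
For instance, a minimal sketch of using it for arbitrary bases, going by the behaviour described in the quoted text (list context returns the number and the count of unparsed characters). The sample inputs are my own:

```perl
use strict;
use warnings;
use POSIX qw(strtol);

# Each case is [string, base]; the last one has a trailing digit that is
# not valid in base 2, so it should be reported as unparsed.
for my $case (['ff', 16], ['777', 8], ['zz', 36], ['1012', 2]) {
    my ($str, $base) = @$case;
    my ($num, $unparsed) = strtol($str, $base);
    printf "%-5s base %2d -> %s (%d unparsed)\n",
        $str, $base, $num, $unparsed;
}
```

Here 'zz' in base 36 comes out as 35*36 + 35 = 1295, and '1012' in base 2 stops after '101', giving 5 with one character left over.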

> When the base
> is zero or omitted C<strtol> will use the string itself to determine the
> base: a leading "0x" or "0X" means hexadecimal; a leading "0" means
> octal; any other leading characters mean decimal.  Thus, "1234" is
> parsed as a decimal number, "01234" as an octal number, and "0x1234"
> as a hexadecimal number.
> 
> =item strtoul
> 
> String to unsigned integer translation, which behaves like C<strtol>.
> 
> =back
> 
> All three provided functions treat errors the same way.  Truly
> POSIX-compliant systems set C<$ERRNO> ($!) to indicate a translation
> error, so clear C<$!> before calling strto*.  Non-compliant systems
> may not check for overflow, and therefore will never set C<$!>.
> 
> Example: to parse a string C<$str> as a floating point number use
> 
>   $! = 0;
>   ($num, $n_unparsed) = strtod($str);
> 
>   if($str eq '' || $n_unparsed != 0 || $!) {
>       die "Non-numeric input $str" . ($! ? ": $!\n" : "\n");
>   }
> 
>   # When you do not care about handling errors, you can do
>   $num = strtod($str);
>   $num = $str + 0;     # same: Perl auto-converts
> 
> =head3 String handling
> 
> Perl knows about latin1 strings are utf-8 strings.

???

> Most complications
> of character-sets are hidden for the user, as long as the user provides
> encoding details at all entry and exit points of the program. See
> L<perlunicode>, L<Encode> and L<encoding>.
> 
>   strtok         perlfunc/split
>   strcat         perlop/.=
>   strchr         perlfunc/index
>   strcmp         perlop/cmp perlop/eq
>   strcpy         perlop/=
>   strerror       perlvar/$ERRNO   "$!"
>   strlen         perlvar/length
>   strstr         perlop/index
>   tolower        perlfunc/lc      "\L$str\E"  "\l$str"
>   toupper        perlfunc/uc      "\U$str\E"  "\u$str"

One problem with listing things that way is that those will not be (and cannot be) links in HTML output. But if you don’t use a verbatim paragraph it looks bad in nroff.  I don’t know of an elegant solution.

> 
> Functions C<strcspn>, C<strpbrk>, C<strrchr>, C<strspn> can best

s/F/The f/
s/C<strspn/and C<strspn/

> be translated into regular expressions. See L<perlre>.
> 
> Functions C<strncat>, C<strncmp>, and C<strncpy> are used for various

The functions...

> purposes which let themselves usually translate into various C<substr>
> features. See L<perlfunc/substr>,
> 
> POSIX.pm provides functions named C<tolower> and C<toupper>, which
> simply call C<lc> and C<uc> respectively.
> 
> =head3 sprintf and scanf
> 
> Function C<sprintf> is provided by C<perlfunc/sprintf>, with many

The C<sprintf> function....

> extensions to the POSIX format specification.
> 
> Missing is function C<sscanf>, to be replace by regular expressions.
> This is a very different syntax. Maybe you can use M<perlfunc/unpack>
> as well.

The C<sscanf> function is not provided.  You can use regular expressions
(see L<perlre>) instead, though the syntax is very different.  See also L<perlfunc/unpack>.

> 
> Flexible versions of formatted prints and scans are not needed, so no
>   snprintf, vsnprintf, vsprintf, vsscanf
> 
> =head3 Characters
> 
> POSIX.xs provides handlers of property groups, which are affected by
> the locale setting as long as all the characters are only in one byte.

Do you mean within the byte range (what I prefer to call the octet range)?

> Please use regular expressions, which are more flexible. This table

s/Please/We recommend that you/

> shows the alternative expressions.
> 
>   isalnum    [[:alnum:]]   \p{Alnum}
>   isalpha    [[:alpha:]]   \p{Alpha}   \pL
>   isascii    [[:ascii:]]   \p{Ascii}
>   isblank    [[:blank:]]   \p{Blank}            \h
>   iscntrl    [[:cntrl:]]   \p{Control} \p{Cc}
>   isdigit    [[:digit:]]   \p{Digit}   \p{Nd}   \d
>   isgraph    [[:graph:]]   \p{Graph}
>   islower    [[:lower:]]   \p{Lower}   \p{Ll}
>   isprint    [[:print:]]   \p{Print}
>   ispunct    [[:punct:]]   \p{Punct}   \pP
>   isspace    [[:space:]]   \p{Space}
>   isupper    [[:upper:]]   \p{Upper}   \p{Lu}
>   isxdigit   [[:xdigit:]]  \p{XDigit}  \p{Hex}  [0-9a-fA-F]
>              [[:word:]]    \p{Word}             \w
> 
> C<\p{PerlSpace}> (C<\s>) is only the ASCII subdomain of C<\p{Space}>.

Not quite—\p{Space} includes \cK, which is ASCII, but \p{PerlSpace} does not include it.
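
A short sketch for checking this on a given perl. The results are deliberately not listed, because they depend on the version: as far as I know, perls from 5.18 onwards added the vertical tab to \s, so newer perls may report a match where the perl under discussion here does not.

```perl
use strict;
use warnings;

my $vt = "\cK";   # vertical tab, U+000B

# Precompiled patterns for the classes being compared.
my @classes = (
    [ '\p{Space}',     qr/\p{Space}/     ],
    [ '\p{PerlSpace}', qr/\p{PerlSpace}/ ],
    [ '\s',            qr/\s/            ],
);

for my $class (@classes) {
    my ($name, $re) = @$class;
    printf "%-14s %s \\cK\n", $name,
        $vt =~ $re ? 'matches' : 'does not match';
}
```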

> Character class C<word> is a Perl extension. There are hundreds more

The C<\p{Word}> character class....

> character classes and extensions. See L<perlunicode> and L<perluniprops>

s/props>/props>./

