Front page | perl.perl5.porters |
Postings from September 2011
Re: The future of POSIX in core
From: Father Chrysostomos
Date: September 11, 2011 13:54
Subject: Re: The future of POSIX in core
Message ID: 456C3CC2-A390-46A0-84AA-554B81E6B90A@cpan.org
On Sep 5, 2011, at 1:31 AM, Mark Overmeer wrote:
> The full text of the current state of my POSIX.pod replacement, (which
> has not been adapted to the discussions of this week) is attached here.
> Next week I hope to have time to create a bunch of patches for blead
> which do follow peoples wishes. (You can contribute textual fixes now)
>
> Do/can we fix Perl’s number-parsing?
*I* don’t know enough about floating-point storage to do it. Basically, large numbers end up with rounding errors that occur when the number is repeatedly multiplied by 10 as it is scanned. I once glanced at SpiderMonkey’s number-parsing algorithm. It starts out the same way as perl’s, but then it has several pages of unreadable code to deal with rounding errors.
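To make the rounding problem concrete, here is a rough sketch. The naive_parse sub is a hypothetical, deliberately simplified scanner written for this illustration; it is not perl's actual parsing code. It accumulates digits by repeated multiplication by 10 and prints the result next to what the system's strtod returns, so the two can be compared on a given platform.

```perl
use strict;
use warnings;
use POSIX qw(strtod);

# Hypothetical simplified scanner: multiply the accumulator by 10
# for every digit seen. Each step may round once the value no longer
# fits exactly in the machine's number type.
sub naive_parse {
    my ($digits) = @_;
    my $n = 0;
    $n = $n * 10 + $_ for split //, $digits;
    return $n;
}

# A 46-digit integer, far beyond what a double can hold exactly.
my $str = '1' . ('234567890' x 5);

my $naive  = naive_parse($str);
my ($libc) = strtod($str);      # list context: (number, unparsed count)

printf "naive:  %.17g\n", $naive;
printf "strtod: %.17g\n", $libc;
print  "the two results ", ($naive == $libc ? "agree" : "differ"), "\n";
```

Whether the two lines differ depends on the platform and on how perl was built, so I am not claiming any particular output.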
I started to read through your document, suggesting changes, but then I realised that this document is probably better as a separate POSIX::Overview document. POSIX.pod could point to it. If I’m just trying to find out what the POSIX module itself does, there is a lot of information to sift through here. Maybe parts of your document (descriptions, for instance) could be in POSIX.pod, and it could be referenced from entries in POSIX::Overview.
I stopped about a third of the way through, as this document is very long, and most of my suggestions were stylistic suggestions that could be applied to the whole thing.
> =head1 NAME
>
> POSIX - Perl interface to IEEE Std 1003.1
>
> =head1 SYNOPSIS
>
> use POSIX;
> use POSIX qw(setsid);
> use POSIX qw(:errno_h :fcntl_h);
>
> printf "EINTR is %d\n", EINTR;
> $sess_id = setsid();
>
> $fd = POSIX::open($path, O_CREAT|O_EXCL|O_WRONLY, 0644);
> # note: that's a filedescriptor, *NOT* a filehandle
>
> =head1 DESCRIPTION
>
> The I<Single UNIX Specification: Authorized Guide to Version 4> published
> by I<The Open Group> is a huge specification. Issue 7 (March 2010) lists
> nearly 1200 functions which are available on many platforms. A large part
> of these functions are defined in POSIX 1003.1-2008, which gives the name
> to this module.
>
> The specification describes how C-programmers can access the system. Perl
I wouldn’t hyphenate C-programmers.
> core and some modules offer most of the POSIX functionality, although
> usually in a more powerful way.
>
> This manual-page tries to explain where you can find these POSIX features
> in the Perl language. Besides, the module does provide implementations
> for a few dozen POSIX functions and hundreds of POSIX constants which
> are in the standard but not in Perl core.
>
> =head2 Backwards compatibility
>
> With Perl 5.14, this module's implementation and documentation has been
s/has/have/
> reworked considerably. In most common cases, you will receive warnings
> when importing functions which were removed because they only croaked
> anyway. You can safely remove those from your import list.
If this doc patch is to be considered separately from your other proposed changes, then that paragraph needs a rewrite.
>
> For reason of unfortunate backwards compatibility, functions are
> I<exported by default> except when they have the same name as built-in
> Perl functions, such as C<abs>, C<alarm>, C<rmdir> and C<write>.
> You probably do not want those functions anyway, because they only
> redirect you to the core function.
>
> This manual page does not explain what the functions do anymore, because
> the POSIX manual do a much better job.
‘manuals do’ or ‘manual does’
Someone mentioned on p5p that it’s good to document them, as different manual pages say different things.
>
> =head2 Caveats
>
> POSIX is a standard with a long history, deeply tied into the Operating
> System. This means that you will encounter (small) differences when
> used on different platforms. Perl core and this POSIX module makes
s/makes/make/
> attempts to hide differences, but is not always succesful.
s/is/are/
>
> =head2 How to use this manual page
>
> For each of the POSIX functions, we try to direct you to the best
> documentation on how to get related functionality in Perl. Even the
> most simple functions will have differences. We do not explain
> the POSIX functions because the standard POSIX manuals do a much
> better job.
>
> Common references are to C<< POSIX.xs >>, which means that it is
> provided in XS code by this POSIX module. In all cases, when C<undef>
> is returned it means failure. In that case, C<< $! >> is set.
>
> A reference to C<< POSIX.pm >> means that it is implemented as a perl
> subroutine by this module. In either case, you require this module in
> your script preferably explicitly naming the functions you use.
Whether it’s POSIX.pm or POSIX.xs is an implementation detail. I don’t think we should include that.
>
> Often, the description will refer you to C<<perlfunc>>, which means
L<perlfunc>
(Also, double angle brackets require inner whitespace. Your text renders in nroff as "<perlfunc">.)
> that there is a perl core function which does similar things as the
> POSIX function. On most platforms you can easily find the perlfunc
> descriptions via C<<perldoc>>. For instance:
L<perldoc>
>
> perldoc -f write
>
> Another important source for Operating System facts are global variables.
Originally, all the Perl documentation used double spaces after full stops. pod2man still formats it that way if there is a dot at the end of a line. So you should either do s/\. (?! )/.  /g or make sure there are no dots at the ends of lines except at the ends of paragraphs.
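A small sketch of that substitution, assuming the replacement is meant to contain two spaces after the dot. The sample text is made up for the illustration.

```perl
use strict;
use warnings;

# Made-up sample: one single-spaced stop, one already double-spaced,
# one more single-spaced, and a final dot with nothing after it.
my $text = "Another source. Those are described.  Already fine. End.";

# Widen a dot followed by exactly one space to dot plus two spaces.
# The negative lookahead leaves already double-spaced stops alone.
(my $fixed = $text) =~ s/\. (?! )/.  /g;

print "$fixed\n";
```

Running it on a whole POD file would also touch dots inside verbatim paragraphs and abbreviations, so the result still wants checking by eye.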
> Those are described in the C<<perlvar>> manual. Those variables have
L<perlvar>
> extremely short names (like C<< $$ >> for process-id) but also longer
> names (C<<$PID>> and C<<$PROCESS_ID>>). The latter are only available
Change C<<...>> to C<...> or C<< ... >> throughout.
> when you require the module C<<English>>. Names defined by the awk
L<English>
> programming language are amongst the long alternative names.
>
> Finally, the table often refers to modules which implement comparible
comparable
> functionality. In most cases, they offer a pure Perl (so portable)
> alternative. Be warned that those modules are not always part of
> the Perl core code, so controled by other people.
Be warned that some/many of those modules are not included with Perl, but are maintained separately on CPAN.
(If you do not agree with my rewording: controlled has two l’s; ‘so’ should be ‘so they are’ or ‘and hence are’.)
>
> =head1 FUNCTIONS
>
> The functions are categorized as listed in chapter 3.3 of the book of
> the Open Group, with as exception those sections which only list
s/with as exception/with the exception of/
or just ‘except for’
> unsupported features.
>
> Everything related to threads, scheduling and tracing control is not
> available in Perl.
Using ‘every’ with a ‘not’ later in the sentence usually sounds confusing to a native speaker (which I am), even if it is technically correct.
Either change ‘not available’ to ‘unavailable’, or, better, change ‘everything’ to ‘nothing’ and omit the ‘not’.
> See L<perlthrtut> on the thread implementation of
> Perl.
>
> =over 4
>
> =item * any posix_* function is not supported
>
> =item * any sched_* function is not supported
>
> =item * any pthread_* function is not supported
>
> =item * any thread-safe function *_r is not supported
Functions beginning with posix_ are not supported, etc.
>
> =back
>
> =head2 Asynchronous Input and Output Interfaces
>
> The functions are provided by M<IO::AIO>:
These functions
> aio_cancel, aio_error, aio_fsync, aio_read, aio_return, aio_suspend,
> aio_write
>
> Not supported:
> lio_listio
>
> =head2 Jump Interfaces
>
> These functions are used to handle error conditions. In Perl, you cannot
> handle these errors on such a low level.
s/on/at/
>
> longjmp perlfunc/die
> setjmp perlfunc/eval
>
> =head2 Maths Library Interfaces
>
> The accuracy of floating point numbers depends on compilation flags
> of Perl. On C level, there are many versions of these modules, but
s/On/At the/
By modules, do you mean functions?
> only one is listed in this section.
>
> L<Math::Trig> provides pure perl implementations for all sinus,
s/sinus/sine/
> cosine and tangent functions and their hyperbolic and reverse
> variants. L<Math::Complex> provides their complex variants. In both
> modules, the names of the functions do sometimes differ slightly from
s/do //
> the POSIX names.
>
> The modules M<Math::BigInt> and M<Math::BigFloat> offer a wide range
s/M</L</g
> of mathematical functions with flexible precission.
>
> Provided by Perl core (see L<perlfunc>) are:
s/Perl/the Perl/
s/ are//
> atan2 perlfunc/atan2
> cos perlfunc/cos
> exp perlfunc/exp
> fabs perlfunc/abs
> isgreaterequal perlop/>=
> isgreater perlop/>
> islessequal perlop/<=
> islessgreater perlop/!=
> isless perlop/<
> llrint perlfunc/int
> log perlfunc/log
> lrint perlfunc/int
> pow perlop/**
> rint perlfunc/int
> sin perlfunc/sin
> sqrt perlfunc/sqrt
>
> Provided by POSIX.xs are
> acos, asin, atan, ceil, cosh, floor, fmod, frexp, ldexp, log10,
> modf, sinh, tan, tanh
>
> Provided by L<List::Util>:
> fmax min
> fmin max
>
> Complex numbers only in Perl wth L<Math::Complex>:
> cabs, cacos, cacosh, carg, casin, casinh, catan, catanh, cbrt,
> ccos, ccosh, cexp, cimag, clog, conj, cpow, cproj, creal, csin,
> csinh, csqrt, ctan, ctanh
>
> Only in pure Perl implementations with L<Math::Trig>
> acosh, asinh, atanh
>
> Probably not supported:
> copysign, erf, erfc, exp2, expm1, fdim, fma, fpclassify, hypot,
> ilogb, isfinite, isinf, isnan, isnormal, isunordered, lgamma,
> llround, log1p, log2, logb, lround, nan, nearbyint, nextafter,
> nexttoward, remainder, remquo, round, scalbln, scalbn, signbit,
> tgamma, trunc
>
> =head2 General ISO C Library Interfaces
>
> =head3 convert strings to values and back
>
> All C<strto*>, C<atof>, C<atoi> and friends functions are not needed
> in Perl:
This is what I was referring to. Instead of saying they are not needed, perhaps you could mention that the system’s strtod sometimes provides more accurate number parsing for large numbers.
> the integers and floats are at their largest size, so when a
> string is used in numeric context it will get converted automatically.
>
> Still, POSIX.xs does provide a few of those functions, although
> you can probably better use regular expressions to validate the
> input. These functions should respect any locale settings.
>
> =over 4
>
> =item strtod
>
> String to double translation. Returns the parsed number and the number
> of characters in the unparsed portion of the string. When called in a
> scalar context C<strtod> returns the parsed number.
>
> =item strtol
>
> String to integer translation. Returns the parsed number and
> the number of characters in the unparsed portion of the string.
> When called in a scalar context C<strtol> returns the parsed number.
>
> The base should be zero or between 2 and 36, inclusive.
If the base can be any number in that range, then this function certainly *is* useful in Perl! I’ve written my own implementation of that, without realising it had already been done for me.
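For instance, something along these lines should work, going by the description of strtol quoted above (list context returns the number and the count of unparsed characters). The input strings are only examples.

```perl
use strict;
use warnings;
use POSIX qw(strtol);

# Parentheses force list context; the first element is the number.
my ($hex)    = strtol('ff', 16);        # hexadecimal
my ($base36) = strtol('zz', 36);        # digits 0-9 and a-z
my ($binary) = strtol('101010', 2);     # binary

# The second return value counts the characters that were not consumed.
my ($num, $unparsed) = strtol('12abc', 10);

print "ff (16)     = $hex\n";
print "zz (36)     = $base36\n";
print "101010 (2)  = $binary\n";
print "12abc (10)  = $num, $unparsed characters left unparsed\n";
```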
> When the base
> is zero or omitted C<strtol> will use the string itself to determine the
> base: a leading "0x" or "0X" means hexadecimal; a leading "0" means
> octal; any other leading characters mean decimal. Thus, "1234" is
> parsed as a decimal number, "01234" as an octal number, and "0x1234"
> as a hexadecimal number.
>
> =item strtoul
>
> String to unsigned integer translation, which behaves like C<strtol>.
>
> =back
>
> All three provided functions treat errors the same way. Truly
> POSIX-compliant systems set C<$ERRNO> ($!) to indicate a translation
> error, so clear C<$!> before calling strto*. Non-compliant systems
> may not check for overflow, and therefore will never set C<$!>.
>
> Example: to parse a string C<$str> as a floating point number use
>
> $! = 0;
> ($num, $n_unparsed) = strtod($str);
>
> if($str eq '' || $n_unparsed != 0 || $!) {
> die "Non-numeric input $str" . ($! ? ": $!\n" : "\n");
> }
>
> # When you do not care about handling errors, you can do
> $num = strtod($str);
> $num = $str + 0; # same: Perl auto-converts
>
> =head3 String handling
>
> Perl knows about latin1 strings are utf-8 strings.
???
> Most complications
> of character-sets are hidden for the user, as long as the user provides
> encoding details at all entry and exit points of the program. See
> L<perlunicode>, L<Encode> and L<encoding>.
>
> strtok perlfunc/split
> strcat perlop/.=
> strchr perlfunc/index
> strcmp perlop/cmp perlop/eq
> strcpy perlop/=
> strerror perlvar/$ERRNO "$!"
> strlen perlvar/length
> strstr perlop/index
> tolower perlfunc/lc "\L$str\E" "\l$str"
> toupper perlfunc/uc "\U$str\E" "\u$str"
One problem with listing things that way is that those will not be (and cannot be) links in HTML output. But if you don’t use a verbatim paragraph it looks bad in nroff. I don’t know of an elegant solution.
>
> Functions C<strcspn>, C<strpbrk>, C<strrchr>, C<strspn> can best
s/F/The f/
s/C<strspn/and C<strspn/
> be translated into regular expressions. See L<perlre>.
>
> Functions C<strncat>, C<strncmp>, and C<strncpy> are used for various
The functions...
> purposes which let themselves usually translate into various C<substr>
> features. See L<perlfunc/substr>,
>
> POSIX.pm provides functions named C<tolower> and C<toupper>, which
> simply call C<lc> and C<uc> respectively.
>
> =head3 sprintf and scanf
>
> Function C<sprintf> is provided by C<perlfunc/sprintf>, with many
The C<sprintf> function....
> extensions to the POSIX format specification.
>
> Missing is function C<sscanf>, to be replace by regular expressions.
> This is a very different syntax. Maybe you can use M<perlfunc/unpack>
> as well.
The C<sscanf> function is not provided. You can use regular expressions
(see L<perlre>) instead, though the syntax is very different. See also L<perlfunc/unpack>.
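A short example might help readers coming from C. This is only a sketch with made-up input: a regular expression stands in for something like sscanf(line, "width=%d height=%d", ...), and unpack handles a fixed-width record.

```perl
use strict;
use warnings;

# Regular expression with captures in place of a sscanf format.
my $line = "width=640 height=480";
my ($w, $h) = $line =~ /^width=(\d+)\s+height=(\d+)$/
    or die "line does not match the expected layout\n";

# unpack suits fixed-width fields: 'A10' reads ten characters and
# strips trailing spaces, 'A3' reads the next three.
my $record = "Alice     042";
my ($name, $age) = unpack "A10 A3", $record;

print "w=$w h=$h name=$name age=$age\n";
```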
>
> Flexible versions of formatted prints and scans are not needed, so no
> snprintf, vsnprintf, vsprintf, vsscanf
>
> =head3 Characters
>
> POSIX.xs provides handlers of property groups, which are affected by
> the locale setting as long as all the characters are only in one byte.
Do you mean within the byte range (what I prefer to call the octet range)?
> Please use regular expressions, which are more flexible. This table
s/Please/We recommend that you/
> shows the alternative expressions.
>
> isalnum [[:alnum:]] \p{Alnum}
> isalpha [[:alpha:]] \p{Alpha} \pL
> isascii [[:ascii:]] \p{Ascii}
> isblank [[:blank:]] \p{Blank} \h
> iscntrl [[:cntrl:]] \p{Control} \p{Cc}
> isdigit [[:digit:]] \p{Digit} \p{Nd} \d
> isgraph [[:graph:]] \p{Graph}
> islower [[:lower:]] \p{Lower} \p{Ll}
> isprint [[:print:]] \p{Print}
> ispunct [[:punct:]] \p{Punct} \pP
> isspace [[:space:]] \p{Space}
> isupper [[:upper:]] \p{Upper} \p{Lu}
> isxdigit [[:xdigit:]] \p{XDigit} \p{Hex} [0-9a-fA-F]
> [[:word:]] \p{Word} \w
>
> C<\p{PerlSpace}> (C<\s>) is only the ASCII subdomain of C<\p{Space}>.
Not quite—\p{Space} includes \cK, which is ASCII, but \p{PerlSpace} does not include it.
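This is easy to check on whatever perl is at hand. The sketch below only reports what each class says about the vertical tab. As far as I know this depends on the perl version (I believe later releases, from 5.18 on, make \s match the vertical tab as well), so I am not stating an expected output for the last two columns.

```perl
use strict;
use warnings;

my $vt = "\cK";    # vertical tab, code point 0x0B

# Record whether each class matches the vertical tab on this perl.
my $in_space     = $vt =~ /\p{Space}/     ? 1 : 0;
my $in_perlspace = $vt =~ /\p{PerlSpace}/ ? 1 : 0;
my $in_s         = $vt =~ /\s/            ? 1 : 0;

printf "perl %vd: \\p{Space}=%d \\p{PerlSpace}=%d \\s=%d\n",
    $^V, $in_space, $in_perlspace, $in_s;
```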
> Character class C<word> is a Perl extension. There are hundreds more
The C<\p{Word}> character class....
> character classes and extensions. See L<perlunicode> and L<perluniprops>
s/props>/props>./