perl.perl5.porters | Postings from February 2012

Re: [perl #109408] Documentation that refers to Perl 5 as new

Tom Christiansen
February 1, 2012 17:31
Abigail wrote on Wed, 01 Feb 2012 15:32:30 +0100:

> On Tue, Jan 31, 2012 at 08:27:02PM -0700, Tom Christiansen wrote:

>> =item *
>> Perl pattern matching uses Unicode rules for case-insensitivity, but Python
>> uses only ASCII casefolding rules, but Perl uses Unicode casefolding rules,
>> so (for example) all three Greek sigmas match case-insensitively in Perl.

> I cannot make head nor tails out of this sentence. It starts off with
> Perl's ability to do case insensitive Unicode matching, contrasts that
> with casefolding in Python, then contrasts that with Unicode casefolding
> in Perl. Too many "but"s to my taste, and IMO, you should either mention
> casefolding three times, or case-insensitivity three times.

You're right, you're right, you're right.  It's blather.
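
The sigma claim itself is easy to check. Here is a minimal sketch, assuming a reasonably modern perl; the hash and variable names are made up for illustration. It tries each of the three sigmas as a pattern against each of them as a subject under /i:

```perl
use strict;
use warnings;

# The three Greek sigmas, written as code points so the source stays ASCII.
my %sigma = (
    capital => "\x{03A3}",   # GREEK CAPITAL LETTER SIGMA
    small   => "\x{03C3}",   # GREEK SMALL LETTER SIGMA
    final   => "\x{03C2}",   # GREEK SMALL LETTER FINAL SIGMA
);

# Try every sigma as a pattern against every sigma as a subject.
my @failures;
for my $pat (sort keys %sigma) {
    for my $subj (sort keys %sigma) {
        push @failures, "$pat vs $subj"
            unless $sigma{$subj} =~ /\A$sigma{$pat}\z/i;
    }
}

print @failures ? "mismatches: @failures\n" : "all nine pairings match\n";
```

If the Unicode casefolding rules are in effect, @failures should stay empty.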

>> =item * 
>> Not all functions need be methods in Perl.

> "need to be"?

Turns out that "need" can be an uninflected modal in English, as in "...if
need be", or in "She need not call again."  I can see why you thought to put
the "to" there, though, since the uninflected modal use is somewhat less
common in everyday speech than the regular inflected form.  Here's
sense 10. c. for need, v.2 from the OED3:

 10. c. trans. With bare infinitive.

    In modern use chiefly in non-affirmative contexts, i.e. in negative clauses,
    with near-negative adverbs such as but and hardly, in as, if, or than clauses,
    in statements expecting or implying a negative response, or in interrogative
    clauses; also with only (usually immediately following the verb).

     (a) In a negative clause or a context with negative implications.
         In modern use the 3rd person singular inflected form is less
         common than the uninflected (β) form.

Citations for the (α) form include:

    1855    Tennyson Maud xxiii. ix, in Maud & Other Poems 80   Who knows‥Whether I need have fled?
    1875    B. Jowett tr. Plato Dialogues (ed. 2) V. 370,   I need hardly ask again.
    1991    What Personal Computer Dec. 19/3   It's a flat panel display with touch screen and stylus modes, so you never need touch a keyboard again.

and for the (β) form:

    1915    R. Frost Let. 11 Nov. (1964) 17   He needn't go calling himself sticky names like Gayheart in public.
    1921    D. H. Lawrence Women in Love (new ed.) xxx. 510   It was a relief to her to be acknowledged extraordinary. Then she need not fret about the common standards.
    1993    Guardian 23 Oct. (Weekend Suppl.) 42/4   Stock-making needn't be the labour-intensive grind described in French cookbooks.
    1997    N.Y. Times 16 Nov. i. 32/3   In baseball, for example, is there any rule saying that a second baseman need only be in the neighborhood of second base while middle-manning a double play?

So it's still current, but again, even a native speaker might well
consider whether a "to" might be called for there.  I just didn't 
put it in, is all.

>> =item * 
>> A Java C<char> is not an abstract Unicode code point; it is a UTF-16 code
>> unit, which means it takes two of Java C<char>s, and special coding, to work
>> outside the Basic Multilingual Plane in Java.  In contrast, a Perl character
>> I<is> an abstract code point, whose underlying implementation is
>> intentionally hidden from the programmer.  Perl code automatically works 
>> on the full range of Unicode—and beyond.

> Well, I grant you that the intent was to hide it from the programmer.
> Unfortunately, in practice, the implementation is often exposed to the
> programmer.

It seems like (nearly?) everything in Java has a 16-bit code-unit "char"
interface, but only a few things have a 32-bit code-point "int" interface.  
It makes it really clunky.  This is what I was trying to allude to.  You
don't get to deal with logical characters as often as you'd like to be 
able to in Java.
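
To make the contrast concrete, here is a small sketch in Perl. The variable names are illustrative, and Encode is used only to count the UTF-16 code units that a Java String.length() would report for the same text:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "\x{1F600}";   # one code point outside the Basic Multilingual Plane

# Perl counts logical characters (code points).
my $perl_length = length $char;

# UTF-16 uses two bytes per code unit, so bytes / 2 gives the code-unit count.
my $utf16_units = length(encode('UTF-16LE', $char)) / 2;

printf "code points: %d, UTF-16 code units: %d, ord: U+%X\n",
    $perl_length, $utf16_units, ord $char;
```

The Perl side sees a single character throughout; the surrogate pair only appears once the string is encoded.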

>> =item * 
>> Perl supports pass by named parameter, allowing optional arguments to
>> be omitted and the argument order freely rearranged.

> There's support in Perl for named parameters other than that Perl doesn't
> prevent the programmer from rolling their own named parameter support?

It's not really possible in most other languages of my acquaintance.

    my %args = @_;

is pretty powerful, actually, and you just can't do that in things like 
Java.  But this is one of those things where I'm coming at it from the 
wrong direction again, I wager.
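
For what it's worth, the roll-your-own idiom amounts to something like the following sketch. The sub name, the keys, and the defaults are all hypothetical:

```perl
use strict;
use warnings;

# Hypothetical sub: defaults come first, caller-supplied pairs override them.
sub server_address {
    my %args = (host => 'localhost', port => 80, @_);
    return "$args{host}:$args{port}";
}

my $default   = server_address();                                   # nothing supplied
my $port_only = server_address(port => 8080);                       # one argument omitted
my $reordered = server_address(port => 443, host => 'example.org'); # order swapped

print "$default $port_only $reordered\n";
```

Because later pairs win in a list assignment to a hash, the caller can omit any argument or give them in any order.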

>> =item * 
>> Perl’s garbage collection system is based on reference counting, so it is possible
>> to write a destructor to automatically clean up resources like open file descriptors,
>> database connections, file locks, etc.

> I don't see why reference counting is necessary to be able to write
> destructors to clean up resources. It's true that Perl uses reference
> counting, and that it's possible to write such a destructor, but I don't
> see the connection.


It's that you cannot guarantee that destructors are *ever* called
in Java — nor, now because of J(ava-P)ython, in Python either.
That means you cannot hope to have a destructor duly called to
free up a non-memory resource.  They might never happen at all,
and in fact, in most JVM implementations, never do get called
at all in the normal course of running.  At all, I said.  Scary.

It's a very different resource-management (non-)strategy than 
we're accustomed to in Perl, where the only non-determinism I'm
aware of with destructors is their order of firing when several
logically achieve a refcount of 0 "simultaneously".

I'm not counting circularities, though, since those don't normally get dealt
with till thread shutdown time.  But at least we guarantee that they *shall*,
assuming you aren't an immortal daemon process.
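
A minimal sketch of the deterministic behaviour I mean, with a made-up class (TempResource) that records events in a log instead of holding a real file descriptor or lock:

```perl
use strict;
use warnings;

package TempResource;   # hypothetical class standing in for a file lock, handle, etc.

sub new {
    my ($class, $log) = @_;
    push @$log, 'acquired';
    return bless { log => $log }, $class;
}

# Runs as soon as the last reference to the object goes away.
sub DESTROY {
    my $self = shift;
    push @{ $self->{log} }, 'released';
}

package main;

my @log;
{
    my $resource = TempResource->new(\@log);
    push @log, 'in scope';
}   # refcount drops to zero here, so DESTROY fires before the next statement
push @log, 'after scope';

print join(', ', @log), "\n";
```

The 'released' entry lands between 'in scope' and 'after scope', which is the guarantee that lets a destructor free a non-memory resource.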

>> =item * 
>> Perl regexes don’t need extra backslashes.

>> =item * 
>> Perl has regex literals, which the compiler compiles and syntax-checks
>> at compile time and stores for efficiency.

> Hmmm. The efficiency is only there if you use them in such a way that it cuts
> down on compilation. It's very easy to get this wrong:

>     my $re = qr {PAT};
>     $str =~ /^$re/;    # Two compilations.
>     my $re = qq {PAT};
>     $str =~ /^$re/;    # One compilation.

The thing I'm trying to convey is that in contradistinction to the
bolted-on-the-side approach to regexes taken by both Java and Python,
in Perl the very compiler itself is aware of their existence.  This
allows it to syntax check and compile them at compile time, which
can essentially "never" happen in Java or Python.  (Ok, people who
are super-careful can do static initializers in Java for this, but
it's a real pain in the butt to manage, and should be done for you.)
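
One way to watch the compiler do this, sketched with a string eval so that the surrounding program survives the error: the malformed literal sits inside a sub that is never called, yet compilation alone rejects it. The variable names are illustrative.

```perl
use strict;
use warnings;

# Compile, but never call, a sub whose regex literal has an unclosed group.
my $sub   = eval q{ sub { return $_[0] =~ /(unclosed/ } };
my $error = $@;

# No sub comes back: the pattern was rejected when the code was compiled.
print defined $sub ? "compiled\n" : "rejected at compile time: $error";
```

In an ordinary source file the same mistake stops the program before its first statement runs.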

I really appreciate your (and everyone's) help in all of this. I just
never became comfortable with my rough draft of this section, as it
never got rewritten.  So this helps it not be a complete embarrassment.

thanks again,

