Front page | perl.perl5.porters |
Postings from February 2022
Re: trim vs trimmed revisited
From: demerphq
Date: February 24, 2022 08:53
Subject: Re: trim vs trimmed revisited
Message ID: CANgJU+UCY8d7dToSQ8iZu38gCdW1ukS1vjOa5dY8J=ZoTHYUAA@mail.gmail.com
On Thu, 24 Feb 2022 at 06:48, Darren Duncan <darren@darrenduncan.net> wrote:
> On 2022-02-23 8:40 p.m., demerphq wrote:
> > To paraphrase what you said above: I hear that there is some wish that Perl had
> > immutable strings. But it doesn't, and making new functions that correspond to
> > that forlorn hope violates all existing precedents and will only sow confusion.
>
> Depending what "immutable strings" means, Perl DOES have them, because it is a
> regular scalar type that is not a reference type.
>
"Immutable strings" normally means "a malloced buffer is never
changed once set" (e.g. Java), and it is this definition that I meant.
Perl does not have immutable strings.
$ perl -MDevel::Peek -le'my $s="0123"; while(length $s) {Dump($s); chop($s)}'
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x55c52e2bb580 "0123"\0
CUR = 4
LEN = 10
COW_REFCNT = 1
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x55c52e287b80 "012"\0
CUR = 3
LEN = 10
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x55c52e287b80 "01"\0
CUR = 2
LEN = 10
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x55c52e287b80 "0"\0
CUR = 1
LEN = 10
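You can see this by looking at the PV buffer: each chop basically results
in us decrementing CUR and overwriting the last character in the PV buffer
with a null. (The PV address changes once, after the first chop, because the
buffer started out flagged IsCOW and had to be un-shared; after that it stays
the same.)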
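The same check can be sketched as a small script instead of reading dumps by eye. This is only an illustration: pv_addr is a hypothetical helper (not part of Devel::Peek), and it assumes a build where a pointer and a UV have the same size, as on typical 64-bit perls. The first chop is done before measuring, since the buffer of a freshly assigned literal may be COW-shared.

```perl
use strict;
use warnings;

# Hypothetical helper: numeric address of the scalar's PV buffer.
# pack 'p' stores a pointer to the string buffer; unpack 'J' reads it
# back as a UV. Assumes sizeof(char *) == sizeof(UV).
sub pv_addr { unpack 'J', pack 'p', $_[0] }

my $s = "0123";
chop $s;                    # first write un-shares a COW buffer, if any
my $before = pv_addr($s);   # address of the now private buffer
chop $s;
chop $s;
my $after = pv_addr($s);    # address after two more in-place chops

print $before == $after ? "same buffer\n" : "buffer moved\n";
```

If the buffer is being modified in place, as the dumps above indicate, this should report the same buffer.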
There is also the offset-ok hack:
$ perl -MDevel::Peek -le'my $s="0123"; Dump($s); while(length $s) {substr($s,0,1,""); Dump($s)}'
SV = PV(0x558caca3afd0) at 0x558caca61a90
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x558caca9d5b0 "0123"\0
CUR = 4
LEN = 10
COW_REFCNT = 1
SV = PV(0x558caca3afd0) at 0x558caca61a90
REFCNT = 1
FLAGS = (POK,OOK,pPOK)
OFFSET = 1
PV = 0x558cacaa52c1 ( "\1" . ) "123"\0
CUR = 3
LEN = 9
SV = PV(0x558caca3afd0) at 0x558caca61a90
REFCNT = 1
FLAGS = (POK,OOK,pPOK)
OFFSET = 2
PV = 0x558cacaa52c2 ( "\1\2" . ) "23"\0
CUR = 2
LEN = 8
SV = PV(0x558caca3afd0) at 0x558caca61a90
REFCNT = 1
FLAGS = (POK,OOK,pPOK)
OFFSET = 3
PV = 0x558cacaa52c3 ( "\1\2\3" . ) "3"\0
CUR = 1
LEN = 7
SV = PV(0x558caca3afd0) at 0x558caca61a90
REFCNT = 1
FLAGS = (POK,OOK,pPOK)
OFFSET = 4
PV = 0x558cacaa52c4 ( "\1\2\3\4" . ) ""\0
CUR = 0
LEN = 6
In older perls you could also see it in s///, but apparently that
optimization has been broken. :-(
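A rough script version of the offset-ok observation, under the same assumptions as any pointer-peeking trick: pv_addr is a hypothetical helper that assumes pointers and UVs have the same size, and the behaviour shown depends on the perl in use having the OOK optimization for removing characters from the front of a string.

```perl
use strict;
use warnings;

# Hypothetical helper: numeric address of the scalar's PV buffer.
# Assumes sizeof(char *) == sizeof(UV), as on typical 64-bit builds.
sub pv_addr { unpack 'J', pack 'p', $_[0] }

my $s = "0123456789";
substr($s, 0, 1, "");       # first removal: un-shares the buffer, sets OOK
my $a1 = pv_addr($s);
substr($s, 0, 1, "");       # second removal: only the start pointer moves
my $a2 = pv_addr($s);

my $delta = $a2 - $a1;      # how far the start of the string advanced
print "start of string advanced by $delta byte(s); now '$s'\n";
```

With the offset hack in effect, the start pointer simply advances into the same allocation, so no characters are copied.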
> As far as I'm concerned, the main badness for "mutable strings" is when this
> happens:
>
> my $s1 = "abc";
> my $s2 = $s1;
> # do something that modifies the string in $s2
> # now $s1 has also been modified
>
No argument there. That would be very bad. :-)
That can only happen if $s2 is an alias to $s1, which can be achieved with
various XS modules, or even in pure perl if you are willing to accept a
wrapper:
$ perl -MDevel::Peek -le'my $s="0123"; print $s; Dump($s); my $alias_ary=sub{\@_}->($s); Dump($alias_ary->[0]); substr($alias_ary->[0],0,1,""); print $s;'
0123
SV = PV(0x55868b3d8fd0) at 0x55868b3ffab0
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x55868b4103e0 "0123"\0
CUR = 4
LEN = 10
COW_REFCNT = 1
SV = PV(0x55868b3d8fd0) at 0x55868b3ffab0
REFCNT = 2
FLAGS = (POK,IsCOW,pPOK)
PV = 0x55868b4103e0 "0123"\0
CUR = 4
LEN = 10
COW_REFCNT = 1
123
Array::RefElem is a common route to creating aliases, as is Data::Alias.
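To make the contrast with plain assignment explicit, here is a small pure-perl sketch using the same sub{\@_} wrapper as above. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $s1 = "abc";

my $copy  = $s1;                  # ordinary assignment: a separate scalar
my $alias = sub { \@_ }->($s1);   # $alias->[0] is the very same SV as $s1

chop $copy;                       # modifies only the copy
print "after chop on copy:  $s1\n";

chop $alias->[0];                 # modifies $s1 through the alias
print "after chop on alias: $s1\n";
```

Only the modification through the alias is visible in $s1; the copy behaves the way the quoted example expects an ordinary string to behave.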
>
> The above description is the case when $s1 was assigned some kind of reference
> type like an arrayref. But it isn't the case for an ordinary Perl string.
>
> For existing cases, does "chomp($s2)" or "$s2 =~ s///" etc REALLY modify the
> string itself, or does it just derive a new string and assign it to $s2?
Depending on the operation, it modifies the PV buffer in place. We do not
allocate a new buffer, copy the changed string into it, and then deallocate
the old buffer (like Java would, modulo GC).
> From the user's perspective I would say make new and assign is what actually
> happens.
>
Depends. With COW that is NOT what happens.
$ perl -MDevel::Peek -e'my $s1="foo"; my $s2=$s1; Dump($s1); Dump($s2);'
SV = PV(0x55debbe32fd0) at 0x55debbe59ae8
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x55debbe95660 "foo"\0
CUR = 3
LEN = 10
COW_REFCNT = 2
SV = PV(0x55debbe33070) at 0x55debbe59b00
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x55debbe95660 "foo"\0
CUR = 3
LEN = 10
COW_REFCNT = 2
Here you can see two SV structures (note the different SV addresses) sharing
one PV buffer (note the identical PV addresses).
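A sketch of what happens next, when one of the two scalars is written to. As before, pv_addr is a hypothetical helper that assumes pointers and UVs have the same size. Whether the buffers are shared before the write depends on the perl version and build, so the script just reports it.

```perl
use strict;
use warnings;

# Hypothetical helper: numeric address of the scalar's PV buffer.
# Assumes sizeof(char *) == sizeof(UV), as on typical 64-bit builds.
sub pv_addr { unpack 'J', pack 'p', $_[0] }

my $s1 = "foo";
my $s2 = $s1;               # with COW, this may share the PV buffer

my $shared_before = pv_addr($s1) == pv_addr($s2);

$s2 .= "bar";               # writing to $s2 forces it onto its own buffer

my $shared_after = pv_addr($s1) == pv_addr($s2);

print "shared before write: ", ($shared_before ? "yes" : "no"), "\n";
print "shared after write:  ", ($shared_after  ? "yes" : "no"), "\n";
print "s1=$s1 s2=$s2\n";
```

Either way $s1 keeps its value, which is why the sharing is invisible from the user's perspective.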
>
> So unless I'm wrong about how things work, Perl's strings ARE immutable,
At least by the definition I used, Perl definitely does *not* have immutable
strings.
> and what we have here is functions or operations that assign a result to the
> variable they got their input from, which is not the same thing.
>
I think you should play around with Devel::Peek and read the core code a
bit; you'll see that perl definitely does not have immutable strings by the
"normal definition" of immutable strings. We modify buffers in place all
the time.
Cheers
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"