Re: trim vs trimmed revisited

From: demerphq
Date: February 24, 2022 08:53
Subject: Re: trim vs trimmed revisited
Message ID: CANgJU+UCY8d7dToSQ8iZu38gCdW1ukS1vjOa5dY8J=ZoTHYUAA@mail.gmail.com

On Thu, 24 Feb 2022 at 06:48, Darren Duncan <darren@darrenduncan.net> wrote:

> On 2022-02-23 8:40 p.m., demerphq wrote:
> > To paraphrase what you said above: I hear that there is some wish that
> > Perl had immutable strings.  But it doesn't, and making new functions
> > that correspond to that forlorn hope violates all existing precedents
> > and will only sow confusion.
>
> Depending what "immutable strings" means, Perl DOES have them, because it
> is a regular scalar type that is not a reference type.
>

"Immutable strings" normally means "where a malloced buffer is never
changed once set" (e.g. Java), and it is this definition that I meant.

Perl does not have immutable strings.

$ perl -MDevel::Peek -le'my $s="0123"; while(length $s) {Dump($s); chop($s)}'
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55c52e2bb580 "0123"\0
  CUR = 4
  LEN = 10
  COW_REFCNT = 1
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x55c52e287b80 "012"\0
  CUR = 3
  LEN = 10
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x55c52e287b80 "01"\0
  CUR = 2
  LEN = 10
SV = PV(0x55c52e258fd0) at 0x55c52e27fa60
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x55c52e287b80 "0"\0
  CUR = 1
  LEN = 10

You can see this by looking at the PV buffer. The first chop un-shares the
COW buffer (IsCOW disappears and the PV address changes once); after that,
each chop just decrements CUR and overwrites the last character in the same
PV buffer with a null.

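There is also the offset-ok (OOK) hack, where removing characters from the
front just advances the PV pointer and records the OFFSET: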

$ perl -MDevel::Peek -le'my $s="0123"; Dump($s); while(length $s) {substr($s,0,1,""); Dump($s)}'
SV = PV(0x558caca3afd0) at 0x558caca61a90
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x558caca9d5b0 "0123"\0
  CUR = 4
  LEN = 10
  COW_REFCNT = 1
SV = PV(0x558caca3afd0) at 0x558caca61a90
  REFCNT = 1
  FLAGS = (POK,OOK,pPOK)
  OFFSET = 1
  PV = 0x558cacaa52c1 ( "\1" . ) "123"\0
  CUR = 3
  LEN = 9
SV = PV(0x558caca3afd0) at 0x558caca61a90
  REFCNT = 1
  FLAGS = (POK,OOK,pPOK)
  OFFSET = 2
  PV = 0x558cacaa52c2 ( "\1\2" . ) "23"\0
  CUR = 2
  LEN = 8
SV = PV(0x558caca3afd0) at 0x558caca61a90
  REFCNT = 1
  FLAGS = (POK,OOK,pPOK)
  OFFSET = 3
  PV = 0x558cacaa52c3 ( "\1\2\3" . ) "3"\0
  CUR = 1
  LEN = 7
SV = PV(0x558caca3afd0) at 0x558caca61a90
  REFCNT = 1
  FLAGS = (POK,OOK,pPOK)
  OFFSET = 4
  PV = 0x558cacaa52c4 ( "\1\2\3\4" . ) ""\0
  CUR = 0
  LEN = 6

In older perls you could also see it in s///, but apparently that
optimization has been broken. :-(


> As far as I'm concerned, the main badness for "mutable strings" is when
> this happens:
>
>    my $s1 = "abc";
>    my $s2 = $s1;
>    # do something that modifies the string in $s2
>    # now $s1 has also been modified
>

No argument there. That would be very bad.  :-)

That can only happen if $s2 is an alias to $s1, which can be achieved with
various XS modules, or even in pure Perl if you are willing to accept a
wrapper:

$ perl -MDevel::Peek -le'my $s="0123"; print $s; Dump($s); my $alias_ary=sub{\@_}->($s); Dump($alias_ary->[0]); substr($alias_ary->[0],0,1,""); print $s;'
0123
SV = PV(0x55868b3d8fd0) at 0x55868b3ffab0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55868b4103e0 "0123"\0
  CUR = 4
  LEN = 10
  COW_REFCNT = 1
SV = PV(0x55868b3d8fd0) at 0x55868b3ffab0
  REFCNT = 2
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55868b4103e0 "0123"\0
  CUR = 4
  LEN = 10
  COW_REFCNT = 1
123

Array::RefElem is a common route to creating aliases, as is Data::Alias.
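
The aliasing can also be observed without Devel::Peek. A minimal sketch,
assuming nothing beyond core Perl (the variable names are only for
illustration): it repeats the @_ wrapper trick from above and adds foreach,
which aliases its loop variable to each element in the same way.

```perl
use strict;
use warnings;

my $s = "0123";

# sub { \@_ } returns a reference to @_, whose elements alias the arguments
my $alias_ary = sub { \@_ }->($s);
substr($alias_ary->[0], 0, 1, "");    # modifies $s through the alias

# foreach aliases its loop variable to each element in the same way
for my $alias ($s) {
    substr($alias, 0, 1, "");         # modifies $s again
}

print "$s\n";    # prints "23"
```

In both cases there is only one SV involved, so this is a second name for
the same scalar, not two strings sharing state.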


>
> The above description is the case when $s1 was assigned some kind of
> reference type like an arrayref.  But it isn't the case for an ordinary
> Perl string.
>
> For existing cases, does "chomp($s2)" or "$s2 =~ s///" etc REALLY modify
> the string itself, or does it just derive a new string and assign it to
> $s2?


Depending on the operation, it modifies the PV buffer in place, without
allocating a new buffer and copying the changes into it (as Java would).
We do not allocate a new buffer and then deallocate the old one (as Java
would, modulo GC).


> From the user's perspective I would say make new and assign is what
> actually happens.
>

Depends. With COW that is NOT what happens.

$ perl -MDevel::Peek -e'my $s1="foo"; my $s2=$s1; Dump($s1); Dump($s2);'
SV = PV(0x55debbe32fd0) at 0x55debbe59ae8
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55debbe95660 "foo"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 2
SV = PV(0x55debbe33070) at 0x55debbe59b00
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x55debbe95660 "foo"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 2

Here you can see two SV structures (note the different SV addresses) sharing
one PV buffer (note the identical PV addresses and COW_REFCNT = 2).
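
At the Perl level the sharing stays invisible. A minimal sketch, assuming a
COW-enabled perl (I believe that means 5.20 or later), of what happens when
one of the two scalars is then written to. The comments describe what I
would expect Devel::Peek's Dump to confirm, not something this script
checks itself.

```perl
use strict;
use warnings;

my $s1 = "foo";
my $s2 = $s1;    # with COW, both SVs may point at one PV buffer here

chop($s2);       # the write un-shares $s2 first, then edits its own buffer

print "s1=$s1 s2=$s2\n";    # prints "s1=foo s2=fo"
```

So $s1 is never affected, but that is because the buffers are un-shared on
write, not because any buffer is immutable.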


>
> So unless I'm wrong about how things work, Perl's strings ARE immutable,


At least by the definition I used, Perl definitely does *not* have immutable
strings.


> and what we have here is functions or operations that assign a result to
> the variable they got their input from, which is not the same thing.
>

I think you should play around with Devel::Peek and read the core code a
bit; you'll see that perl definitely does not have immutable strings by the
"normal definition" of immutable strings. We modify buffers in place all
the time.

Cheers
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
