Re: Behavior of bitwise ops on unencountered wide characters

From: demerphq
Date: July 12, 2017 17:46
Subject: Re: Behavior of bitwise ops on unencountered wide characters
Message ID: CANgJU+VMqGvRGHHJ1tLs1wtGxU-wSV=gJpM-4g69vpfMVC4mTg@mail.gmail.com

On 12 July 2017 at 19:02, Karl Williamson <public@khwilliamson.com> wrote:
> On 07/12/2017 04:50 PM, Sawyer X wrote:
>>
>>
>>
>> On 07/11/2017 01:09 PM, Karl Williamson wrote:
>>>
>>> On 07/10/2017 11:12 PM, Father Chrysostomos wrote:
>>>>
>>>> Karl Williamson wrote:
>>>>>
>>>>> I don't yet have a fully formulated opinion on this, but one question I
>>>>> would have is "How is this different from division by 0" that people
>>>>> seem to deal ok with.
>>>>
>>>>
>>>> Fatal division by zero is ancient.  Fatalizing bitwise operations on
>>>> utf8 breaks stuff.
>>>>
>>>> As I suggested in another thread (I seem to have been ignored), it
>>>> would be *much* kinder to users to make it a warning.  (Wide character
>>>> in blah blah blah.)  That way users who care can fatalize it, or sup-
>>>> press it.  You have the best of all three worlds.
>>>>
>>>
>>> I believe I've referred to your suggestion in some thread.  It is the
>>> minimum we should do.  And others believe it should be deprecated.
>>
>>
>> There is a specific cost here Graham noted. This method is currently
>> used to determine if a variable is a number without loading "B", which
>> isn't cheap. While it is a simple argument of "users shouldn't care,"
>> serializations (like JSON) need to be able to map them to their right
>> type. It would be nice if there was a way to do this without B.
>>
>
> It would be good to have some alternative that requires only a cheaply
> loaded, or internal module, something named like "Internals" that provides a
> clear access path for the things we have determined warrant it, such as
> Graham's use case.  He had to explain to me how it worked, and he had to
> explain to Yves as well.

The problem is he isn't really correct. I have been down this path
before in Sereal and it hurts. See below.

> That demonstrates it is non-obvious.  When the
> tools aren't available, people will do clever, but non-maintainable things
> to get what they need.  But it is best to furnish the tools when it becomes
> known that they would be useful.

Unfortunately, I never replied to Graham, which I should have.

On 16 June 2017 at 13:04, Graham Knop <haarg@haarg.org> wrote:
> On Thu, Jun 15, 2017 at 3:55 AM, demerphq <demerphq@gmail.com> wrote:
>> On 9 June 2017 at 11:17, Graham Knop <haarg@haarg.org> wrote:
>>> The result of ($var ^ "") can tell you the
>>> status of the internal flags for whether a value is a valid number, which
>>> is needed for serialization.
>>
>> You mean it can tell you if something is a number that has never been
>> stringified right?
>>
>> Can you explain this one a bit more?
>
> $number & "" -> 0
> $string & "" -> ""
>
> The flags being checked are SVp_IOK or SVp_NOK.  A number that has
> been stringified will still register as a number based on this check.

This is not always the case. These kinds of checks are inherently problematic.

$ perl -MDevel::Peek -le'$s="0e1"; 0+$s; print $s & ""; Dump($s);'
0
SV = PVNV(0x1710550) at 0x1731b28
  REFCNT = 1
  FLAGS = (IOK,NOK,POK,pIOK,pNOK,pPOK)
  IV = 0
  NV = 0
  PV = 0x1720d80 "0e1"\0
  CUR = 3
  LEN = 16

$ perl -MDevel::Peek -le'$s=" 10 "; 0+$s; print $s & ""; Dump($s);'
0
SV = PVIV(0x1da3f60) at 0x1da9b28
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 10
  PV = 0x1d98d80 " 10 "\0
  CUR = 4
  LEN = 16

$ perl -MDevel::Peek -le'$s="000"; 0+$s; print $s & ""; Dump($s);'
0
SV = PVIV(0x17c2f60) at 0x17c8b28
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 0
  PV = 0x17b7d80 "000"\0
  CUR = 3
  LEN = 16

I am sorry, I very very very much sympathise with your desire to tell
numbers from strings, but you simply can't reliably use our flags to
do it. At least currently.

Our current flags do NOT provide a way to track the origin type of a
variable. It is that simple. We have discussed how we could change the
meaning of the flags so we /could/ track the origin type, but we have
not done so, and any code that tries to do so is inherently flawed.

cheers,
Yves

Another example: I think that the output of these two could
legitimately change if we were to optimise things:

$ perl -MDevel::Peek -le'$s=1; $s.""; print $s & ""; Dump($s);'
0
SV = PVIV(0x217bf50) at 0x2181b18
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 1
  PV = 0x2170d70 "1"\0
  CUR = 1
  LEN = 16


$ perl -MDevel::Peek -le'$s=1; $s.=""; print $s & ""; Dump($s);'

SV = PVIV(0x222cf50) at 0x2232b18
  REFCNT = 1
  FLAGS = (POK,pPOK)
  IV = 1
  PV = 0x2221d70 "1"\0
  CUR = 1
  LEN = 16



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
