
Re: Pre-RFC: `unknown` versus `undef`

From: Oodler 577 via perl5-porters
Date: December 18, 2021 12:27
Subject: Re: Pre-RFC: `unknown` versus `undef`
Message ID: Yb3Tr5DlutyOabHC@odin.sdf-eu.org
* Oodler 577 via perl5-porters <perl5-porters@perl.org> [2021-12-18 12:20:35 +0000]:

> Top posting a few points:
> 
> 1. in DBI:: modules, NULL is equivalent to undef
> 2. in real world tables, "employee" usually has a "type" field (or external way to derive this attribute); volunteers (interns?), salaried, and hourlies are not differentiated by the value of the "salary" field
> 3. seems like the polymorphic OOP solution is to just have:
> 
> * a way to "get_employee_type" on the employee instance, and
> * throw an exception if $emp->get_salary is called on $emp,
>   C<if ($emp->get_employee_type == VOLUNTOLD)>.
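
A minimal sketch of point 3, assuming a hypothetical Employee class; the
constants, the constructor, and the error text are made up for illustration
and are not taken from any existing module:

```perl
use strict;
use warnings;

package Employee {
    # Hypothetical type codes; a real schema would define its own.
    use constant { SALARIED => 1, HOURLY => 2, VOLUNTOLD => 3 };

    sub new {
        my ( $class, %args ) = @_;
        return bless( {%args}, $class );
    }

    sub get_employee_type { return $_[0]{type} }

    sub get_salary {
        my ($self) = @_;

        # Refuse to answer rather than hand back undef or 0.
        die "volunteers have no salary\n"
            if $self->get_employee_type == VOLUNTOLD;
        return $self->{salary};
    }
}

my $paid      = Employee->new( type => Employee::SALARIED(), salary => 50_000 );
my $volunteer = Employee->new( type => Employee::VOLUNTOLD() );

print 'paid: ', $paid->get_salary, "\n";

my $ok = eval { $volunteer->get_salary; 1 };
my $error = $ok ? q{} : $@;
print "volunteer: $error" unless $ok;
```

The caller has to decide what a volunteer means for its own logic; the
salary field never has to carry that information.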
> 
> I can add that the nuances between undef and q{} have caused me confusion
> in the past; but adding another special value to mean a type of "nothing"
> could be problematic. For example, to point #1 above, how is a DBI call to
> know what you mean when it already treats undef as NULL when a) replacing
> placeholders (used almost universally), or b) turning NULLs into undef when
> returning results from a C<select_*> call?
> 
> With C<use warnings>,

oof - forgot to add; with warnings on, nearly all of the following yell at
you (C<q{} eq 0> is the quiet one); but even if not enabled, C<q{}> and
C<undef> get coerced to C<0> in numeric context, and in string context
C<undef> gets coerced to C<q{}>.
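
A small sketch of those coercions, using only core Perl. The variable names
are illustrative, and the warning handler is there only so the script can
count the complaints instead of printing them:

```perl
use strict;
use warnings;

# Capture warnings so we can report how many comparisons complained.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, shift };

my ( $empty, $undef ) = ( q{}, undef );

# Numeric context: both operands become 0.
my $empty_num_eq = ( $empty == 0 ) ? 'true' : 'false';
my $undef_num_eq = ( $undef == 0 ) ? 'true' : 'false';

# String context: undef becomes q{}, and 0 becomes "0".
my $empty_str_eq = ( $empty eq 0 ) ? 'true' : 'false';
my $undef_str_eq = ( $undef eq 0 ) ? 'true' : 'false';

print "q{}   == 0 : $empty_num_eq\n";
print "undef == 0 : $undef_num_eq\n";
print "q{}   eq 0 : $empty_str_eq\n";
print "undef eq 0 : $undef_str_eq\n";
print scalar(@warnings), " warning(s) captured\n";
```

The string comparison of q{} is the one case that stays quiet, since an
empty string is a perfectly good string.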

> 
> TRUE:
>   (q{}   == 0)
>   (undef == 0)
> 
> FALSE:
>   (q{}   eq 0)
>   (undef eq 0)
> 
> What makes matters somewhat more confusing is that C<int> returns 0 for
> both, even with C<use warnings>:
> 
>   int undef
>   int q{}
> 
> Though, this is implied in the TRUE/FALSE examples provided above. In
> addition to this, it seems like this would mess with C<defined>; would
> this add a possible answer to that? I'm just now able to remember that
> C<defined undef> is rightly different than C<defined 0> or C<defined q{}>.
> 
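
To make the C<int> versus C<defined> contrast concrete, here is a short
sketch in core Perl. The warnings are switched off inside the loop on
purpose, since only the returned values matter here; the %seen hash is just
a scratch structure for the demo:

```perl
use strict;
use warnings;

my @cases = ( [ 'undef', undef ], [ '0', 0 ], [ 'q{}', q{} ] );
my %seen;

for my $case (@cases) {
    my ( $label, $value ) = @$case;

    # int() warns for undef and q{}; silence that for this demo only.
    no warnings qw(uninitialized numeric);

    $seen{$label} = {
        'defined' => defined($value) ? 'yes' : 'no',
        'int'     => int($value),
    };
    printf "%-5s defined: %-3s int: %d\n",
        $label, $seen{$label}{'defined'}, $seen{$label}{'int'};
}
```

All three collapse to 0 under C<int>, while C<defined> singles out undef.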
> Anyway, I can't say I've ever wanted more out of a numerical field
> other than its value. I'd never use undef or NULL to indicate anything
> beyond that this field was not set with an actual value. And if I wanted
> to know if $employee was a volunteer, I'd consult the "employee_type"
> column.
> 
> Seems like this is best left to a module that overrides operators; but
> the question for me remains - how would you represent this in a traditional
> database other than storing either as a separate column or some sort
> of composite value (e.g., "0;volunteer") that this value is not to be
> taken like the rest of the numbers? 
> 
> Cheers,
> Brett
> 
> * Ovid via perl5-porters <perl5-porters@perl.org> [2021-12-18 11:09:02 +0000]:
> 
> > Yes, SQL NULL is broken in fundamental ways that CJ Date shows here: https://www.oreilly.com/library/view/sql-and-relational/9781449319724/ch04s04.html
> > 
> > And yes, I've been bitten by that bug in SQL in real-world code. Once. In over two decades. And I write lots of SQL. *Most* of the time, however, the 3VL NULL is what we need. Can you imagine if NULL followed "undef" behavior?
> > 
> >     SELECT count(*) FROM things WHERE value > ?;
> > 
> > That would be a disaster and it's easily replicable in Perl:
> > 
> >     my $total = grep { $_->value > $limit } @things;
> > 
> > I, for one, am tired of writing code like this:
> > 
> >     my $total = grep { defined $_->value ? $_->value > $limit : 0 } @things;
> > 
> > Note: the following is *not* equivalent to the above:
> > 
> >     my $total = grep { ( $_->value // 0 )  > $limit } @things;
> > 
> > I mean, it *looks* correct, but what if the value can be a negative number and the limit can be negative? You probably then want this:
> > 
> >     my $total = grep { ( $_->value // ( $limit - 1 ) )  > $limit } @things;
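
A sketch of why the three forms differ, simplified to plain scalars instead
of objects with a C<value> method (that simplification and the sample data
are mine, not from the quoted mail):

```perl
use strict;
use warnings;

my @values = ( -5, undef, -1, 3 );
my $limit  = -2;

# Explicit defined check: undef never counts.
my $explicit = grep { defined $_ ? $_ > $limit : 0 } @values;

# Defaulting to 0: undef becomes 0, which is greater than a negative limit.
my $default_zero = grep { ( $_ // 0 ) > $limit } @values;

# Defaulting to just below the limit: undef never counts again.
my $below_limit = grep { ( $_ // ( $limit - 1 ) ) > $limit } @values;

print "explicit defined check: $explicit\n";
print "default to 0:           $default_zero\n";
print "default below limit:    $below_limit\n";
```

With a negative limit the C<// 0> version counts the undef entry as a match,
which is exactly the case being warned about.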
> > 
> > Which arguably might be more confusing than using defined. With 3VL, we have this:
> > 
> >     my $total = grep { $_->value > $limit } @things;
> > 
> > Worse, I'm tired of tracking down bugs caused by this.
> > 
> > 2VL logic on undef/null values has been broken for a long time and forces developers to remember to always write special case code to handle this.
> > 
> > However, while we could correct the underlying issue, going further into 4VL or 5VL adds complications that I doubt most developers are going to understand. In other words, SIMPLICITY IS YOUR FRIEND.
> > 
> > We don't need "perfect" because making something that covers all possible cases is simply going to be a mess and might even be counter-productive. For example, if you're unauthorized to get a value but you see that it's a "known defined value", that's an information leak. Also, given Merijn's original list:
> > 
> > 1. Known defined value
> > 2. Known undefined value
> > 3. Unknown value
> > 4. Unauthorized to get the value
> > 5. Value is defined but unauthorized to get it
> > 
> > I don't see how 4+1 is different from 5. So we can bikeshed this to death, or fix the major underlying problem: $salary += 1000. Congrats. You've just given a raise to an unpaid volunteer.
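
A sketch of that raise in core Perl; the hash and its keys are hypothetical
stand-ins for real employee records. Note that C<+=> on an undefined value
is exempt from the "uninitialized" warning, so this happens silently even
under C<use warnings>:

```perl
use strict;
use warnings;

# Capture warnings to show that none are raised.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, shift };

# undef here stands for "no salary", e.g. a NULL fetched through DBI.
my %salary = ( alice => 40_000, volunteer_bob => undef );

$salary{$_} += 1000 for keys %salary;

print "$_: $salary{$_}\n" for sort keys %salary;
print scalar(@warnings), " warning(s) captured\n";
```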
> > 
> > Best,
> > Ovid
> > -- 
> > IT consulting, training, specializing in Perl, databases, and agile development
> > http://www.allaroundtheworld.fr/. 
> > 
> > Buy my book! - http://bit.ly/beginning_perl
> > 
> > 
> > On Saturday, 18 December 2021, 11:20:06 CET, Darren Duncan <darren@darrenduncan.net> wrote: 
> > 
> > For the record, which I partially discussed on a related Twitter thread a few 
> > days ago, I feel that using anything other than 2VL in a fundamental capacity is 
> > a serious mistake, and if anyone is considering 3VL, 4VL, etc then they probably 
> > have a design flaw that should be corrected in some other way.
> > 
> > All the regular types and operators should operate with pure 2VL, including all 
> > the regular equality or comparison or sorting operators.  The behavior of 
> > regular operators should not be overloaded or overridden, lexically or 
> > otherwise, so that they behave in a 3+VL manner.  This would be a huge source of 
> > bugs where people look at code expecting certain behavior and getting something 
> > else.
> > 
> > How something like a special Unknown value should work is that it provides a set 
> > of operators/subs with DIFFERENT NAMES that provide the 3VL etc logic, and so 
> > for example one writes:
> > 
> >   eq_3vl($x,$y)
> >   lt_3vl($x,$y)
> >   grep_3vl(...)
> > 
> > Or for simple 3VL you don't even need the special Unknown value, instead these 
> > operators can treat the standard undef that way.  The Unknown value is more 
> > useful if you want to override the behavior of existing operators.
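
A rough sketch of what such separately named subs could look like when they
treat the standard undef as "unknown". The bodies are my assumption (string
comparison to match Perl's own C<eq> and C<lt>, undef as the unknown
result); the quoted mail only gives the names:

```perl
use strict;
use warnings;

# Return 1 or 0 when both sides are known, undef when either is unknown.
sub eq_3vl {
    my ( $x, $y ) = @_;
    return undef unless defined $x && defined $y;
    return $x eq $y ? 1 : 0;
}

sub lt_3vl {
    my ( $x, $y ) = @_;
    return undef unless defined $x && defined $y;
    return $x lt $y ? 1 : 0;
}

# Keep only items for which the test is definitely true;
# an unknown result is treated the same as false.
sub grep_3vl(&@) {
    my ( $test, @items ) = @_;
    return grep { $test->($_) } @items;
}

my @names = ( 'alice', undef, 'zed', 'bob' );
my @early = grep_3vl { lt_3vl( $_, 'm' ) } @names;

print "before 'm': @early\n";
```

The built-in operators are untouched, so code that does not call the
C<_3vl> subs behaves exactly as before.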
> > 
> > As for 4VL, 5VL, etc, once you even start thinking about that, there's an even 
> > stronger case that what you really should be using is 2VL with a bunch of 
> > singleton types, where each singleton represents a specific reason a normal 
> > value is missing, such as:
> > 
> >   Unknown
> >   Not Applicable
> >   Permission Denied
> >   Record Not Found
> >   etc
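
One way the singleton idea could be sketched, assuming a hypothetical
Missing class with one shared object per reason; the class, method, and
helper names are invented for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package Missing {
    my %singleton;    # one shared instance per reason

    sub for_reason {
        my ( $class, $reason ) = @_;
        return $singleton{$reason} //= bless( { reason => $reason }, $class );
    }

    sub reason { return $_[0]{reason} }
}

# Ordinary 2VL code: either we have a number or we have a reason.
sub describe_salary {
    my ($salary) = @_;
    return 'missing: ' . $salary->reason
        if blessed($salary) && $salary->isa('Missing');
    return "salary: $salary";
}

my $not_applicable = Missing->for_reason('Not Applicable');

print describe_salary(42_000), "\n";
print describe_salary($not_applicable), "\n";
```

Because each reason is a single shared object, a plain identity check is
enough to tell reasons apart; no operator needs 3VL behavior.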
> > 
> > The idea of changing the behavior of undef even lexically with a feature is 
> > problematic.  What if someone sees code in such a file and copies it into 
> > another file, or in reverse, where the other doesn't have that feature declared, 
> > then code which is exactly the same has changed behavior.
> > 
> > As a compromise, I would find either of these 2 things acceptable:
> > 
> > 1. Have an Unknown::Values or whatever singleton class which overrides built-in 
> > operators/subs but its effects are tightly bound to instances of that class.
> > 
> > 2. Declare new operators/subs with new names that provide 3VL with standard undefs.
> > 
> > Those provide this 3VL opt-in and explicitly if users want it, and it's 
> > relatively easy for users reading the code later to know it's using 3VL.
> > 
> > But behavior of built-in operators or undefs should never change as the result 
> > of a feature pragma or such.
> > 
> > Also, SQL NULLs are not actually 3VL, they are much more complicated than that, 
> > and we don't want to try and imitate SQL if we want to provide 3VL.
> > 
> > -- Darren Duncan
> > 
> > On 2021-12-18 1:43 a.m., H.Merijn Brand wrote:
> > > On Sat, 18 Dec 2021 08:57:05 +0000 (UTC), Ovid via perl5-porters <perl5-porters@perl.org> wrote:
> > > 
> > >> Hi there,
> > >>
> > >> As most of you know, "undef" values often cause all sorts of interesting bugs in Perl. I wrote https://metacpan.org/pod/Unknown::Values to address this. Instead of the 2VL that undef uses, it uses Kleene's traditional 3VL (three-value logic) akin to SQL's NULL.
> > > 
> > >  1. Known defined value
> > >  2. Known undefined value
> > >  3. Unknown value
> > >  4. Unauthorized to get the value
> > >  5. Value is defined but unauthorized to get it
> > > 
> > > When doing 3VL, number 4 is essential
> > > 
> > >> Basic usage looks like this:
> > >>
> > >>      use Unknown::Values;
> > >>  
> > >>      my $value  = unknown;
> > >>      my @array  = ( 1, 2, 3, $value, 4, 5 );
> > >>      my @less    = grep { $_ < 4 } @array;   # (1,2,3)
> > >>      my @greater = grep { $_ > 3 } @array;   # (4,5)
> > >>  
> > >>      my @underpaid;
> > >>      foreach my $employee (@employees) {
> > >>      
> > >>          # this will never return true if salary is "unknown"
> > >>          if ( $employee->salary < $threshold ) {
> > >>              push @underpaid => $employee;
> > >>          }
> > >>      }
> > >>
> > >> I've also written about this here: http://blogs.perl.org/users/ovid/2013/02/three-value-logic-in-perl.html
> > >>
> > >> I've always thought this belongs directly in a programming language, but never suggested this because I assumed there would be no interest
> > >>
> > >> To my surprise, brian d foy suggested it be in the core (https://twitter.com/briandfoy_perl/status/1471684211602042880)
> > >>
> > >> He wrote: "Unknown::Value from @OvidPerl looks very interesting. These objects can't compare, do math, or most of the other default behavior that undef allows. This would be awesome in core."
> > >>
> > >> Would there be interest?
> > > 
> > > Yes, when 4VL (or 5VL)
> > > 
> > > Thinking about it, there might be a hook to add 5, 6, 7, 8, 9 etc :)
> > > 
> > >> Ovid
> > > 
> > 
> > 
> 
> -- 
> --
> oodler@cpan.org
> oodler577@sdf-eu.org
> SDF-EU Public Access UNIX System - http://sdfeu.org
> irc.perl.org #openmp #pdl #native
> 

-- 
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
