
Re: PPC Elevator Pitch for Perl::Types

From: Oodler 577 via perl5-porters
Date: August 22, 2023 04:19
Subject: Re: PPC Elevator Pitch for Perl::Types
Message ID: ZOQ3OPbbDHplnw2I@odin.sdf-eu.org

> From: Dave Mitchell
> Date: August 19, 2023 07:40
> Subject: Re: PPC Elevator Pitch for Perl::Types
> Message ID: ZOBx23dmJJLPb3Cg@iabyn.com

> On Sat, Aug 19, 2023 at 02:56:13AM +0000, Oodler 577 via perl5-porters wrote:
> > Does that help answer your initial questions?

> 'fraid not :-)

Sorry, we'll keep trying our best to explain!  ;-)

> If I am understanding correctly, Perl::Types is intended to be something
> which can be used stand-alone, independent of RPerl, and which you want to
> be bundled with the perl core.

Yes, that is absolutely correct.

> Within that context, how does adding 'use Perl::Types' at the top of a
> perl (not RPerl) program do its thing? What mechanism are you using that
> causes this line in a non-RPerl program to croak;

>     my number $x;
>     ...
>     $x = 'foo';  # croak

> e.g. is it a source filter, set magic attached to $x, or ...?

Yes, you are correct: the current mechanism for enabling type-checking
for subroutine calls is essentially a type of source filter.  As
mentioned in our first reply:

> > The Perl compiler currently supports type enforcement for subroutine calls, so that is our starting point for Perl::Types.

This source-filter-like functionality is currently contained in
the Perl compiler's file `Class.pm` and is triggered by including
`use RPerl;` in a Perl source code file.  Similarly, this functionality
will eventually be triggered by the line `use Perl::Types;` instead,
once we complete the refactoring of `Perl::Types` into its own
distribution.

You can see how the `TYPE_CHECKING` preprocessor directive is
handled, with minor differences between the `ON` and `TRACE`
settings:

https://metacpan.org/release/WBRASWELL/RPerl-7.000000/source/lib/RPerl/CompileUnit/Module/Class.pm#L673-715

```perl
if ( $CHECK eq 'ON' ) {
    my $i = 0;                    # integer
    foreach my $subroutine_argument ( @{$subroutine_arguments} ) {
        # only enable type-checking for arguments of supported type;
        # NEED UPGRADE: enable checking of user-defined Class types & all other remaining RPerl types
        if (exists $TYPES_SUPPORTED->{$subroutine_argument->[0]}) {
            $subroutine_arguments_check_code .= q{    rperltypes::} . $subroutine_argument->[0] . '_CHECK( $_[' . $i . '] );' . "\n";  # does work, hard-code all automatically-generated type-checking code to 'rperltypes::' namespace
        }
        $i++;
    }
 
    activate_subroutine_args_checking( $package_name, $subroutine_name, $subroutine_type, $subroutine_arguments_check_code, $module_filename_long );
    $inside_subroutine         = 0;
    $subroutine_arguments_line = q{};
}
```
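To make the string-building step easier to follow in isolation, here is a minimal, self-contained sketch of the same idea.  The `build_check_code()` helper, the contents of `$TYPES_SUPPORTED`, and the sample argument list are assumptions made up for illustration; they are not taken from `Class.pm`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the $TYPES_SUPPORTED lookup table.
my $TYPES_SUPPORTED = { integer => 1, number => 1, string => 1 };

# Build one "<type>_CHECK( $_[N] );" line per argument of a supported type.
# Each argument is a [ type, name ] pair, so ->[0] is the type name.
sub build_check_code {
    my ($subroutine_arguments) = @_;
    my $check_code = q{};
    my $i          = 0;
    foreach my $subroutine_argument ( @{$subroutine_arguments} ) {
        if ( exists $TYPES_SUPPORTED->{ $subroutine_argument->[0] } ) {
            $check_code .= q{    rperltypes::} . $subroutine_argument->[0] . '_CHECK( $_[' . $i . '] );' . "\n";
        }
        $i++;    # the index advances even when an argument is skipped
    }
    return $check_code;
}

# Illustrative argument list: the middle argument has an unsupported type.
my $generated = build_check_code(
    [ [ 'number', '$x' ], [ 'My::Class', '$obj' ], [ 'string', '$label' ] ] );
print $generated;
```

In this sketch the unsupported `My::Class` argument produces no check line, while the positional index still counts it, so the generated checks refer to `$_[0]` and `$_[2]`.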

You can see that both `ON` and `TRACE` call `activate_subroutine_args_checking()`,
which inserts the type-checking calls into a new subroutine that
wraps around the original un-type-checked subroutine:

https://metacpan.org/release/WBRASWELL/RPerl-7.000000/source/lib/RPerl/CompileUnit/Module/Class.pm#L1016-1183

```perl
    # re-define subroutine call to include type checking code; new header style
    do
    {
        no strict;

        # create unchecked symbol table entry for original subroutine
        *{ $package_name . '::__UNCHECKED_' . $subroutine_name } = \&{ $package_name . '::' . $subroutine_name };  # short form, symbol table direct, not strict

        # delete original symtab entry
        undef *{ $package_name . '::' . $subroutine_name };

        # re-create new symtab entry pointing to checking code plus unchecked symtab entry
        $subroutine_definition_code .=
            '*' . $package_name . '::' . $subroutine_name . ' = sub { ' .
            $subroutine_definition_diag_code .
            ($subroutine_arguments_check_code or "\n") .
            '    return ' . $package_name . '::__UNCHECKED_' . $subroutine_name . '(@ARG);' . "\n" . '};';

        # create new checked symtab entries, for use by Exporter
        $check_code_subroutine_name = $package_name . '::__CHECK_CODE_' . $subroutine_name;
        $subroutine_definition_code .= "\n" . '*' . $package_name . '::__CHECKED_' . $subroutine_name . ' = \&' . $package_name . '::' . $subroutine_name . "\n" . ';';

        $subroutine_definition_code .= "\n" . '*' . $check_code_subroutine_name . ' = sub {' . "\n" . '    my $retval ' . q{ =<<'EOF';} . "\n" . $subroutine_arguments_check_code . "\n" . 'EOF' . "\n" . '};' . "\n";
    };

    eval($subroutine_definition_code) or (RPerl::diag('ERROR ECOPR02, PRE-PROCESSOR: Possible failure to enable type checking for subroutine ' . $package_name . '::' . $subroutine_name . '(),' . "\n" . $EVAL_ERROR . "\n" . 'not croaking'));
    if ($EVAL_ERROR) { croak 'ERROR ECOPR03, PRE-PROCESSOR: Failed to enable type checking for subroutine ' . $package_name . '::' . $subroutine_name . '(),' . "\n" . $EVAL_ERROR . "\n" . 'croaking'; }
```

Thus, any call to your example `mutate()` subroutine would actually
end up calling the new wrapper subroutine instead, which would
start off by type-checking the input argument(s) before proceeding
to run the original `mutate()` code.  (As mentioned before, this
would require adding `( my number $my_arg ) = @ARG;` to the top of
`mutate()`, so that `activate_subroutine_args_checking()` knows to
insert a call to `number_CHECK()` or `number_CHECKTRACE()`.)
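The wrapping step can be reduced to a small runnable sketch.  This is only an approximation under stated assumptions: it installs a closure through the symbol table instead of `eval`-ing generated source, it uses `@_` rather than `@ARG`, and `number_CHECK()`, `wrap_with_checks()`, and the body of `mutate()` are hypothetical stand-ins, not the real implementations.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for number_CHECK(); the real check may differ.
sub number_CHECK {
    my ($value) = @_;
    die "number_CHECK: value is not a number\n"
        if !defined $value or !looks_like_number($value);
    return;
}

# Example subroutine to be wrapped (body is illustrative only).
sub mutate {
    ( my $my_arg ) = @_;
    return $my_arg * 2;
}

# Replace Package::name with a wrapper that checks each listed argument
# and then calls the original, which stays reachable as __UNCHECKED_name.
sub wrap_with_checks {
    my ( $package_name, $subroutine_name, @checks ) = @_;
    no strict 'refs';
    no warnings 'redefine';
    my $unchecked = \&{ $package_name . '::' . $subroutine_name };
    *{ $package_name . '::__UNCHECKED_' . $subroutine_name } = $unchecked;
    *{ $package_name . '::' . $subroutine_name } = sub {
        for my $i ( 0 .. $#checks ) { $checks[$i]->( $_[$i] ); }
        return $unchecked->(@_);
    };
    return;
}

wrap_with_checks( 'main', 'mutate', \&number_CHECK );

my $ok     = mutate(21);                      # passes the check, runs original
my $failed = !eval { mutate('foo'); 1 };      # check dies before original runs
```

Existing callers of `mutate()` need no changes, because they look the subroutine up by name and therefore reach the wrapper.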

We can enable type-checking for subroutine return values by
simply extending this same `activate_subroutine_args_checking()`
to include a call to the appropriate `foo_CHECK()` or `foo_CHECKTRACE()`
right before returning to the original caller.
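As a rough sketch of what such an extension might look like, the wrapper below checks a scalar return value before handing it back.  The `with_return_check()` helper and this `number_CHECK()` are hypothetical, and list-context returns are ignored for brevity.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for number_CHECK(); the real check may differ.
sub number_CHECK {
    my ($value) = @_;
    die "number_CHECK: value is not a number\n"
        if !defined $value or !looks_like_number($value);
    return;
}

# Wrap a code reference so that its scalar return value is checked
# right before it is returned to the caller.
sub with_return_check {
    my ( $code, $return_check ) = @_;
    return sub {
        my $retval = $code->(@_);    # scalar context only, for brevity
        $return_check->($retval);
        return $retval;
    };
}

my $half       = with_return_check( sub { return $_[0] / 2 }, \&number_CHECK );
my $bad_return = with_return_check( sub { return 'foo' },     \&number_CHECK );

my $half_of_ten     = $half->(10);                      # numeric result passes
my $return_rejected = !eval { $bad_return->(); 1 };     # 'foo' fails the check
```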

Regarding your `my number $x; $x = 'foo';` example, this will
require the use of a `tie` or similar mechanism to intercept every
modification of `$x`, inserting a call to `number_CHECK()` or
`number_CHECKTRACE()` as previously described.  In the past, we
have always simply allowed the C(++) compiler to provide this
functionality for us; however, this will have to be explicitly
implemented when `Perl::Types` is refactored into its own distribution.
Arbitrarily-nested data structures (arrays and/or hashes) can be
recursively type-checked using the same `tie` approach, intercepting
any modifications to internal elements and calling `foo_CHECK()`
or `foo_CHECKTRACE()` for each change.
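A minimal sketch of the `tie` approach for a single scalar follows.  The class name `Local::TypedNumber` and its numeric test are assumptions for illustration only, since this part of `Perl::Types` is not yet implemented.

```perl
use strict;
use warnings;

package Local::TypedNumber;    # hypothetical class name
use Scalar::Util qw(looks_like_number);
use Carp qw(croak);

# The tied object is a blessed reference to the stored value.
sub TIESCALAR {
    my ( $class, $initial ) = @_;
    my $value = $initial;
    return bless \$value, $class;
}

sub FETCH {
    my ($self) = @_;
    return ${$self};
}

# Every assignment to the tied variable arrives here, so this is where
# a number_CHECK()-style test can reject non-numeric values.
sub STORE {
    my ( $self, $new_value ) = @_;
    croak 'value is not a number'
        if !defined $new_value or !looks_like_number($new_value);
    ${$self} = $new_value;
    return;
}

package main;

tie my $x, 'Local::TypedNumber', 0;

$x = 3.14;                                  # accepted by STORE
my $accepted = $x;
my $rejected = !eval { $x = 'foo'; 1 };     # STORE croaks
my $still    = $x;                          # previous value is retained
```

Arrays and hashes could be handled along the same lines through Perl's `TIEARRAY` and `TIEHASH` interfaces, with the check applied in each element-level `STORE`.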

> Also, from your reply about my mutator example not croaking by default,
> this seems to imply that it's quite possible for $x to obtain a
> non-numeric value under some circumstances. So, by default, what
> circumstances will croak and which will silently allow $x to become
> something other than a number? E.g. which of these lines croaks?

>     $x = 'foo';
>     $x =~ s/1/a/;
>     substr($x,0,1) = 'foo';
>     my $y = substr($x,0,1);

Once the above-mentioned `tie` functionality is implemented, then
all four of the lines above would potentially give an error if
attempting to store a non-`number` value into the `number`-typed
`$x` variable.

> Just in general terms I'm still confused as to what effect adding
> 'Perl::Types' to my program will have, and how it achieves that effect.

If you set `TYPE_CHECKING` to `OFF` and don't use the Perl compiler,
then `use Perl::Types;` will not have any effect other than allowing
you to use the Perl data types (and their associated helper
subroutines), thereby improving the readability of your code.

If you set `TYPE_CHECKING` to `ON` or `TRACE` (default) and don't
use the Perl compiler, then `use Perl::Types;` will utilize the
source filter mechanism for type-checking errors on subroutine
entry and/or exit, and will also utilize the `tie` mechanism for
type-checking errors on variable modification, thereby improving
the correctness of your code.

If you set `TYPE_CHECKING` to `ON` or `TRACE` (default) and do use
the Perl compiler, then `use Perl::Types;` will utilize the C(++)
compiler for type-checking errors on subroutine entry and/or exit,
as well as variable modification, thereby improving the runtime
performance of your code.

So, as mentioned in the original Elevator Pitch, we can "utilize
Perl data types to achieve a number of benefits including but not
limited to":

* increased performance
* code correctness
* type safety (type checking)
* memory safety (bounds checking)
* documentation of code intent
* potential for polymorphism
* potential for derived or synthetic types

Currently, the Perl compiler has well over 4K individual test cases,
a large number of which relate to the Perl data type system.  The
exact number of tests which will be moved from the Perl compiler
into `Perl::Types` will be determined when the refactoring process
is complete.

https://metacpan.org/release/WBRASWELL/RPerl-7.000000/source/t

https://metacpan.org/release/WBRASWELL/RPerl-7.000000/source/lib/RPerl/Test

Are we making progress toward answering your initial questions?

On Behalf of the _Perl::Types Committee_,
Brett Estrade 

--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
