Pre-RFC: Implement UNIVERSAL::import() in universal.c, and allow attributes to be used to mark subs for export.
From: demerphq
Date: February 23, 2022 11:43
Subject: Pre-RFC: Implement UNIVERSAL::import() in universal.c, and allow attributes to be used to mark subs for export.
Message ID: CANgJU+Wy7VEppLKZpw2FDSZRd4_p1Umz51aQqpspQeDy8-BdSg@mail.gmail.com
*INTRODUCTION:
Currently there is a special case in the core so that when the import
method is called against *any* class name and no such import function is
defined, the method call silently succeeds but does nothing. This dates
back to Perl 5.0 and is presumably so that when
use Whatever;
is translated into its definition of
BEGIN {
    require Whatever;
    Whatever->import();
}
no error is generated by the import call. The workaround was probably
chosen because at the time there was no universal.c, which was added in
5.003 in commit 6d4a7be2b18d1674acf2ccc0da715a204e2d1ed0. Even when
universal.c was added, no import() method of last resort was added.
Arguably this was a mistake. It means that a number of interesting issues
get swept under the carpet. It also exposes the possibility of interesting
bugs.
For instance, suppose there is a module "Thing", and someone typos its name
and writes:
Thng->import("whatever");
no error will be produced, no "whatever" will be imported, etc. This is
particularly relevant on case-insensitive file systems where a statement
like:
use List::util qw(sum);
will silently succeed: the require logic will try to load the file
"List/util.pm" and the case-insensitive file system will happily open the
file "List/Util.pm". However, the import that will be executed is
List::util->import("sum"), which won't exist, so the "sum" function will
*silently* not be imported. The result will be a very confused developer.
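The silent no-op described above is easy to observe directly. The following is a minimal sketch, assuming a perl that behaves as this post describes; "Thng" and "frobnicate" are made-up names used only for illustration, and no such package is loaded.

```perl
use strict;
use warnings;

# No package called Thng exists, yet calling import() on it does not die.
my $import_ok = eval { Thng->import("whatever"); 1 } ? 1 : 0;

# Any other missing method on the same class name dies as usual.
my $other_ok    = eval { Thng->frobnicate("whatever"); 1 } ? 1 : 0;
my $other_error = $@;

print "import() survived: $import_ok\n";
print "frobnicate() survived: $other_ok\n";
print "error: $other_error";
```

Only the method named "import" gets this special treatment, which is why a mistyped class name goes unnoticed in a use statement but not in an ordinary method call.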
Another example of where this was a mistake is what happens if someone
takes a reference to the UNIVERSAL::import function. This will create a
stub function in the UNIVERSAL namespace, more or less the same as if
someone did a forward declaration of the sub, which will die if it is not
"filled in" with a proper implementation. If someone does this then any
further use statements for modules which do not define an import method
(such as OO class modules) will die with an error:
perl -le'BEGIN { my $stub=\&UNIVERSAL::import; } use File::Spec; '
Undefined subroutine &UNIVERSAL::import called at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
Why would anyone do this? Good question, but there is at least one example
bundled with Perl right now:
dist/autouse/lib/autouse.pm: $import == \&UNIVERSAL::import)
If someone uses autouse in the wrong way it will break. (autouse seems
pretty dodgy to me actually; someone should review it. It seems to have an
unhealthy relationship with Exporter, but there are other modules that
export independent of Exporter.pm.)
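The stub mechanism behind the UNIVERSAL::import failure is not specific to that name. As a minimal sketch, the same effect can be reproduced with a hypothetical class "Demo::Class" and a method "frob" that is never defined (both names are invented for this illustration):

```perl
use strict;
use warnings;

package Demo::Class;              # hypothetical class with no frob() method
sub new { return bless {}, shift }

package main;

# Taking a reference to a sub that was never defined creates a stub for it.
my $stub = \&Demo::Class::frob;

my $exists  = exists  &Demo::Class::frob ? 1 : 0;   # is it declared?
my $defined = defined &Demo::Class::frob ? 1 : 0;   # does it have a body?

# Method resolution now finds the stub and fails when trying to call it.
my $called = eval { Demo::Class->frob; 1 } ? 1 : 0;
my $error  = $@;

print "exists=$exists defined=$defined called=$called\n";
print "error: $error";
```

The stub is visible to method lookup but has no body, so the call fails with an "Undefined subroutine" error rather than the usual "Can't locate object method" one. With UNIVERSAL::import the stub additionally defeats the core special case, because an import method is now found for every class.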
Another example of problems this hides is people who write things like:
use File::Spec qw(catfile);
which is also currently in the blead codebase. File::Spec is an OO module.
It does not export anything and it does not define an import method. So
this code silently loads File::Spec, fools people into thinking it imports
"catfile", but actually does not.
git grep "use File::Spec " | grep catfile
cpan/Test-Simple/t/Test2/modules/IPC/Driver/Files.t:use File::Spec qw/catfile/;
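A short sketch makes the File::Spec case concrete. It assumes a perl where the import call is still the silent no-op described above; "Demo::Broken" and "Demo::Working" are hypothetical package names, while File::Spec::Functions is the functional interface that ships alongside File::Spec.

```perl
use strict;
use warnings;

{
    package Demo::Broken;                   # hypothetical consumer package
    use File::Spec qw(catfile);             # looks like an import, imports nothing
}

{
    package Demo::Working;                  # hypothetical consumer package
    use File::Spec::Functions qw(catfile);  # this module really exports catfile
}

my $broken_has  = defined &Demo::Broken::catfile  ? 1 : 0;
my $working_has = defined &Demo::Working::catfile ? 1 : 0;

# The class-method form works regardless of what was (not) imported.
my $via_method   = File::Spec->catfile("lib", "Thing.pm");
my $via_function = Demo::Working::catfile("lib", "Thing.pm");

print "Demo::Broken has catfile: $broken_has\n";
print "Demo::Working has catfile: $working_has\n";
print "same result: ", ($via_method eq $via_function ? "yes" : "no"), "\n";
```

Code written like Demo::Broken typically keeps working only because it goes on to call File::Spec->catfile(...) as a class method anyway.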
*PART 1:
If we define a UNIVERSAL::import() function in universal.c then we can fix
the bug related to taking a reference to &UNIVERSAL::import, and we can
implement logic that at least warns or dies (my preference) when someone
tries to pass arguments into a non-existent import method. At least then
the List::util case would throw an error and people on case-insensitive
file systems would have a similar experience to those on case-sensitive
ones: a fatal exception when trying to import a symbol from a non-existent
package. This would go some way to resolving a recurring complaint of
people using Perl on case-insensitive file systems.
I have implemented much of this logic in the branch
yves/fix_universal_import_fragility, aka
https://github.com/Perl/perl5/pull/19419
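The idea of Part 1 can be illustrated in pure Perl. This is only a sketch under stated assumptions: it is not the C code from the PR, the error message wording is invented, and it takes the "die" option rather than the "warn" option.

```perl
use strict;
use warnings;

{
    # Assumption: a pure-Perl stand-in for the proposed method of last
    # resort. The real proposal implements it in C in universal.c.
    no warnings qw(redefine once);
    *UNIVERSAL::import = sub {
        my ($class, @args) = @_;
        return unless @args;          # plain "use Whatever;" stays a no-op
        die sprintf "%s has no import method, cannot import: %s\n",
            $class, join(", ", @args);
    };
}

my $plain_ok     = eval { Thng->import();           1 } ? 1 : 0;
my $with_args_ok = eval { Thng->import("whatever"); 1 } ? 1 : 0;
my $error        = $@;

print "import() without arguments survived: $plain_ok\n";
print "import() with arguments survived: $with_args_ok\n";
print "error: $error";
```

Classes that define or inherit their own import method (for example via Exporter) never reach this fallback, so only the "arguments passed to nothing" case is affected.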
*PART 2:
When discussing this patch on the #p5p IRC channel, ilmari pointed out that
adding a UNIVERSAL::import() method would make it possible to make the
Exporter module redundant. In effect we would move much of the symbol
exporting behavior out of Exporter.pm and into C code in universal.c. Thus
instead of having to explicitly inherit from Exporter, or to import its
import function, we could simply write:
package Whatever;
@EXPORT_OK= qw(something);
sub something { ... }
and things like:
use Whatever qw(something);
would "just work". Exporter.pm has a relatively stable API as far as I
know, and it would not be too difficult to translate it into C. This would
mean that loading modules which export would be faster (because Exporter.pm
would not have to be loaded and compiled, and because the import logic
would be implemented in C) and use less memory (because we would not need
to add import subs to all kinds of modules' namespaces).
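To make the Part 2 idea concrete, here is a heavily simplified pure-Perl sketch of the kind of logic that would move into UNIVERSAL::import(). It is an illustration only: "Sketch::Export" and "import_sketch" are hypothetical names, the method is called with an explicit package qualifier instead of being installed in UNIVERSAL, and it handles plain sub names only (no sigils, tags, or the rest of the Exporter.pm API).

```perl
use strict;
use warnings;

package Sketch::Export;    # hypothetical home for the illustration

# Simplified stand-in for an import method of last resort: copy the requested
# subs, or the @EXPORT defaults, into the caller's namespace.
sub import_sketch {
    my ($class, @wanted) = @_;
    my $target = caller;
    no strict 'refs';
    my @default = @{"${class}::EXPORT"};
    my %allowed = map { $_ => 1 } @default, @{"${class}::EXPORT_OK"};
    @wanted = @default unless @wanted;
    for my $name (@wanted) {
        die qq{"$name" is not exported by $class\n} unless $allowed{$name};
        *{"${target}::$name"} = \&{"${class}::$name"};
    }
    return;
}

package Whatever;
our @EXPORT_OK = qw(something);
sub something { return "something was called" }

package Consumer;

# Stands in for "use Whatever qw(something);".
Whatever->Sketch::Export::import_sketch("something");
my $result = something();

# Names that are in neither array are refused.
my $refused = eval { Whatever->Sketch::Export::import_sketch("nonesuch"); 1 } ? 0 : 1;

print "$result\n";
print "unknown name refused: $refused\n";
```

Note that package Whatever neither inherits from anything nor defines an import sub of its own, which is the memory and load-time saving the proposal is after.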
*PART 3:
Ilmari also pointed out that if we defined a new attribute, say "EXPORT_OK"
or "EXPORT" or both, then we could use attributes to populate the
@EXPORT/@EXPORT_OK arrays, which would then allow us to avoid the need for
double entry of exported methods. Maybe we could even add an attribute that
allowed people to define which export tags a sub should be exported under
(I don't know the attributes API that well right now, so I don't know for
sure if this is possible). So for instance we could then write:
package Thing;
sub something :EXPORT_OK {
    ...
}
sub wotzit :EXPORT {
    ...
}
This attribute would cause the symbol 'something' to be added to the
@Thing::EXPORT_OK array, and the symbol 'wotzit' to the @Thing::EXPORT
array. This would then allow the UNIVERSAL::import() logic defined in Part 2
to perform the task of Exporter.pm. Personally I have always found the
double entry requirements of exporting methods annoying, and a regular
source of error, e.g. changing an exported sub's name without changing its
name in the @EXPORT_OK array. IMO it would also reduce the number of bugs
from documenting that a sub is exported but forgetting to actually add its
name to the relevant array.
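Something close to this can already be approximated in pure Perl with the existing MODIFY_CODE_ATTRIBUTES hook from the attributes pragma. The sketch below is an assumption-laden illustration, not the proposed built-in: "Sketch::ExportAttrs" and "populate_export_arrays" are hypothetical names, the package has to inherit the hook, and names are resolved in an explicit second step because the hook receives only a code reference.

```perl
use strict;
use warnings;

package Sketch::ExportAttrs;    # hypothetical helper, not a real module

our %marked;    # package name => list of [ coderef, target array name ]

# perl calls this hook at compile time for every sub declared with attributes
# in a package that inherits from this one. Returning the attributes that
# were not recognised lets perl report them as invalid.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @attrs) = @_;
    my @unknown;
    for my $attr (@attrs) {
        if ($attr eq 'EXPORT' or $attr eq 'EXPORT_OK') {
            push @{ $marked{$package} }, [ $code, $attr ];
        }
        else {
            push @unknown, $attr;
        }
    }
    return @unknown;
}

# Resolve the recorded coderefs to names by scanning the package's symbol
# table, then append each name to @EXPORT or @EXPORT_OK.
sub populate_export_arrays {
    my ($package) = @_;
    no strict 'refs';
    for my $name (sort keys %{"${package}::"}) {
        next unless defined &{"${package}::$name"};
        my $code = \&{"${package}::$name"};
        for my $entry (@{ $marked{$package} || [] }) {
            my ($marked_code, $array) = @$entry;
            push @{"${package}::$array"}, $name if $marked_code == $code;
        }
    }
    return;
}

package Thing;
BEGIN { our @ISA = ('Sketch::ExportAttrs') }
our (@EXPORT, @EXPORT_OK);

sub something :EXPORT_OK { return "something" }
sub wotzit    :EXPORT    { return "wotzit" }

Sketch::ExportAttrs::populate_export_arrays(__PACKAGE__);

print "EXPORT_OK: @EXPORT_OK\n";
print "EXPORT: @EXPORT\n";
```

A built-in attribute would remove both the inheritance requirement and the explicit populate step, which is what makes the core implementation attractive.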
One issue this raises is people who would want to mix the attribute form
with the declarative form. People who want to do this would need to do the
declarative part first, in a BEGIN, and then the attributes later. Perhaps
we could do some magic under the hood to detect if people forget the BEGIN
and warn, or we could just document "don't do that then". E.g.:
package Thing;
BEGIN {
    @EXPORT_OK= qw( throbnitz );
}
sub throbnitz {
    ...
}
sub something :EXPORT_OK {
    ...
}
sub wotzit :EXPORT {
    ...
}
This would safely make sure that 'throbnitz' and 'something' were populated
into the @EXPORT_OK array, and that 'wotzit' would be populated into the
@EXPORT array.
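The reason the BEGIN matters is ordering: attributes on named subs take effect at compile time, while a plain array assignment runs at run time and would overwrite whatever was added earlier. The sketch below shows only that ordering effect. It assumes the attribute would push onto the array at compile time and uses a BEGIN block with push as a stand-in for it; the "Demo::" package names are hypothetical.

```perl
use strict;
use warnings;

{
    package Demo::WithoutBegin;
    our @EXPORT_OK;
    # Stand-in for what a compile-time :EXPORT_OK attribute would do.
    BEGIN { push @EXPORT_OK, 'something' }
    # Plain assignment runs at run time, after every BEGIN, and overwrites it.
    @EXPORT_OK = qw( throbnitz );
}

{
    package Demo::WithBegin;
    our @EXPORT_OK;
    # Declarative part first, also at compile time ...
    BEGIN { @EXPORT_OK = qw( throbnitz ) }
    # ... so the later compile-time push appends instead of being clobbered.
    BEGIN { push @EXPORT_OK, 'something' }
}

print "without BEGIN: @Demo::WithoutBegin::EXPORT_OK\n";
print "with BEGIN:    @Demo::WithBegin::EXPORT_OK\n";
```

In the first package 'something' is lost, which is exactly the mistake a warning for a forgotten BEGIN would be meant to catch.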
Assuming it is possible to have variable-level built-in attributes, we
could also make this work:
our $CONTROL :EXPORT;
and have '$CONTROL' be injected into the @EXPORT array similarly to how we
would with sub names.
*CONCLUSION:
I have implemented Part 1 already in the branch
yves/fix_universal_import_fragility and PR
https://github.com/Perl/perl5/pull/19419. I have not yet implemented Part
2, but would be willing to do the work. I would also be willing to do Part
3, but probably would need some help as I don't have much experience
working with attributes.
Thank you for your time and consideration in reading this Pre-RFC.
cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"