
a perl SMP safari

From:
B. Estrade
Date:
April 10, 2021 20:59
Subject:
a perl SMP safari
Message ID:
eb73d02d-7933-a287-5797-543ec2e0ebb2@cpanel.net
I. Introduction

TL;DR: I am dropping the "thread" thing. I don't think it is helpful. 
Dropping it also doesn't affect the following, which has always been the 
goal: SMP/multi-core semantics for easier and more fun multi-core 
programming.

One thing about perl/Perl that I absolutely love is that there is a set 
of "secret" or "discovered" idioms and capabilities. For whatever 
reason (leveraging "it's a feature, not a bug", the side effects of 
doing other things on purpose, or just straight wizardry), we have 
things like Perl's set of secret operators [0]. The Schwartzian 
Transform [1] seems to fall into this camp.

Given the introduction, I am on a similar search for the tiny cracks and 
opportunities in any part of the perl/Perl stack that might be 
sufficient (or necessary) hidden opportunities for "interesting".

II. A Hopeful Example of What's Possible Today: Coordination of Child 
Processes, using fork and the Atomicity of open(2)

One example: "atomicity". Processes (child processes, and really 
processes of any origin) can exploit kernel level support for the 
necessary atomics in the open(2) syscall directly via Perl's "sysopen" 
(with O_CREAT|O_EXCL, creating a file succeeds for exactly one caller 
and fails for everyone else); a similar guarantee is exposed by using 
"symlink" (see File::Symlink::Atomic). I therefore conclude that Perl 
has core support for a very basic form of atomics.
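
A minimal sketch of that atomic primitive, assuming a POSIX-like system 
and a local filesystem. The helper name try_claim and the lock file name 
are made up for illustration; only sysopen and the Fcntl flags are core 
Perl.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);
use File::Temp qw(tempdir);

# Hypothetical helper: returns 1 only for the caller that creates $path.
# O_CREAT|O_EXCL makes "check that it is absent, then create it" one
# atomic step inside the kernel, so two callers can never both succeed.
sub try_claim {
    my ($path) = @_;
    sysopen(my $fh, $path, O_CREAT | O_EXCL | O_WRONLY, 0600) or return 0;
    print {$fh} "$$\n";    # record who holds the claim (optional)
    close $fh;
    return 1;
}

my $dir  = tempdir(CLEANUP => 1);
my $lock = "$dir/claim.lock";      # illustrative path

my $first  = try_claim($lock);     # creates the file
my $second = try_claim($lock);     # fails: the file already exists
print "first=$first second=$second\n";   # first=1 second=0

unlink $lock;                      # releasing the claim is just an unlink
```

The same call made from ten different processes behaves the same way: 
one of them gets a true value, the rest get false and can retry.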

What's the use case for this that is useful now? The most basic form is 
a developer who wishes to do some work using a number of fork'd 
processes. However, during the course of the work, there is a part that 
must be coordinated among non-communicating sequential processes (a 
term I believe is super helpful given our situation). Using the atomic 
capabilities of the operating system, well outside the scope of the 
perl processes, we can provide a wall that all processes must respect.

E.g.,

sub go_nucular {
   my ($ds) = @_;
   # do something that leverages sysopen or symlink's kernel
   # level atomicity; return true on success, false otherwise
}

my $cpid;
my $complicated_ds = _get_scarey_HoHoAoAoH(); # you get the idea
for (1..10) {
   $cpid = fork;
   die "fork failed: $!" if not defined $cpid;
   if (not $cpid) {
     # kid does stuff
     # ...
     # kid tries to do "atomic thing" until successful
     while (not my $w00t = go_nucular($complicated_ds)) { cry() }

     # kid is done, mom calls for dinner; time to go home and eat tendies
     exit;
   }
}
# assume some form of wait or waitpid is here for the parent...

Notes about the above example:

1. this is a "lock", kind of, but not really; it's optimistic, just like 
the kid who keeps trying to run through a wall; the child has no 
explicit knowledge of the barrier

2. not even the parent perl process is aware of this kernel level 
protection; the parent perl doesn't care in this case, and there surely 
are other examples where we can use kernel level guarantees for other 
interesting things that the parent perl process doesn't need to know about

3. in reading up on fork just to make sure I didn't make a totally bone 
headed assertion, I read this:

"File descriptors (and sometimes locks on those descriptors) are shared, 
while everything else is copied." [2]

I gotta say, this is exciting. It used the word "shared". I have not 
tried it, but it seems accurate to say a "shared" file handle among all 
processes is right up open(2)'s alley. It might (_might_) even expose 
some potential to be exploited in more interesting ways. I don't know 
anything about the internals of a file descriptor. But something inside 
the system support for Perl's 'fork' is a mechanism by which newly 
minted child processes are given one.
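
To make the "shared" part concrete, here is a small sketch of what the 
fork documentation describes. It assumes a POSIX-like system; the 
variable names are illustrative. The parent opens one file, the child 
writes through the inherited descriptor, and the parent's later write 
lands after the child's because both processes share a single file 
offset.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# One open file, created before the fork so the child inherits it.
my ($fh, $path) = tempfile(UNLINK => 1);

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: syswrite bypasses perl's buffering and goes straight to
    # the descriptor it inherited from the parent.
    syswrite($fh, "child\n");
    POSIX::_exit(0);    # leave without running the parent's cleanup code
}

waitpid($pid, 0);

# Parent: the file offset was advanced by the child's write, so this
# line is appended after "child\n" instead of overwriting it.
syswrite($fh, "parent\n");
close $fh;

open my $in, '<', $path or die "cannot read $path: $!";
my @lines = <$in>;
close $in;
print @lines;    # child, then parent
```

The writes are deliberately serialized with waitpid here. Several 
children writing at the same time through one shared descriptor is 
where the atomicity tricks above would start to matter.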

III. The Hunt

Motivated by [0], [1], and the example above; the challenge can be 
stated simply:

What hidden opportunities are there in perl/Perl RIGHT NOW that can be 
used TODAY as a basis for exploiting the OS for doing interesting things 
like SMP or introducing perlish atomic semantics? In addition to taking 
full advantage of how well perl/Perl understands the unix system model 
and underlying operating system, what language feature ideas can this 
generate for serious consideration? Maybe if we can't have 'real SMP', 
we can explore semantics that make it more natural to use fork, on a 
high level, in the same way.

Here is a list of "challenges" I think would be useful, either as actual 
questions to answer - or as prompts for "thought exercises" or p5 fanfic:

1. what additional syscalls can be exploited (legally abused) for 
atomicity or more interesting things?

2. how can sysopen be used to implement a construct for use by child 
processes in a fork context, such as one that presents a forked 
environment in which the parent also participates (keywords/names are 
just for clarity, I care only about semantics):

   my $MAX_PROCS = 8;
   do_all($MAX_PROCS) {
     # wait barrier for all children based on sysopen
     critical {
       # do stuff with the guarantee that no other in the family of
       # processes are executing this code
     }
     while_waiting(maxattempts => 10,
                   bail_msg    => q{I'm out. Something wrong.}) {

     }
   }

I am not going to do it, but I am 100% sure that can be implemented 
right now in babby perl using just fork, sysopen, alarm, and die.
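
As a rough illustration of that claim, here is one possible sketch using 
plain subs and code refs instead of new keywords. Everything named here 
(do_all, critical, the lock and counter files) is hypothetical, and a 
retry counter stands in for alarm. Each participant bumps a counter file 
inside the critical section; without mutual exclusion the 
read-modify-write would lose updates.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);
use File::Temp qw(tempdir);
use POSIX ();
use Time::HiRes qw(sleep);

# Hypothetical "critical": run $body only while holding the lock file.
sub critical {
    my ($lock, $max_attempts, $body) = @_;
    for my $attempt (1 .. $max_attempts) {
        if (sysopen(my $fh, $lock, O_CREAT | O_EXCL | O_WRONLY, 0600)) {
            close $fh;
            my $ok  = eval { $body->(); 1 };
            my $err = $@;
            unlink $lock;            # always release, even on failure
            die $err unless $ok;
            return 1;
        }
        sleep 0.01;                  # the "while_waiting" part: back off
    }
    die "I'm out. Something wrong.\n";
}

# Hypothetical "do_all": fork $n - 1 children; the parent joins as id 0.
sub do_all {
    my ($n, $body) = @_;
    my @kids;
    for my $id (1 .. $n - 1) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            my $ok = eval { $body->($id); 1 };
            warn $@ unless $ok;
            POSIX::_exit($ok ? 0 : 1);   # skip the parent's cleanup code
        }
        push @kids, $pid;
    }
    $body->(0);
    my $failures = 0;
    for my $pid (@kids) {
        waitpid($pid, 0);
        $failures++ if $? != 0;
    }
    return $failures;
}

my $dir = tempdir(CLEANUP => 1);
my ($lock, $counter) = ("$dir/lock", "$dir/counter");
open my $init, '>', $counter or die "init: $!";
print {$init} "0\n";
close $init;

my $failed = do_all(4, sub {
    critical($lock, 1000, sub {
        open my $in, '<', $counter or die "read: $!";
        chomp(my $n = <$in>);
        close $in;
        open my $out, '>', $counter or die "write: $!";
        print {$out} $n + 1, "\n";
        close $out or die "close: $!";
    });
});

open my $final, '<', $counter or die "final read: $!";
chomp(my $total = <$final>);
close $final;
print "failed=$failed total=$total\n";   # failed=0 total=4
```

A process that dies while holding the lock leaves the file behind, so a 
real version would need some stale-lock policy. That is left out here.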

It also looks suspiciously like the semantics of try/catch (hmmm). What 
is the win? Semantics that allow Perl to "get out of the way" (finally) 
when doing things in a multi-process way.

3. are there opportunities to provide child processes "shared" things, 
NOW? I go back to the note about 'fork' above:

   File descriptors (and sometimes locks on those descriptors) are
   shared, while everything else is copied. [2]

While the following can't be done in babby perl, I think it can be done; 
albeit there is danger here. But babies should not be doing this:

   # parent perl process
   my %some_hash_of_cool_stuff = ( a => 3, s => 'yes', l => 'LA' );
   my $hash_ref_of_uncool_stuff = { J => 42, m => 12, j => sin(180) };
   my $what_I_had_for_lunch = q{pineapple pizza};
   my @mbox;
   open my $shared_fh, q{>}, q{/tmp/a-file} or die;
   my $MAX_PROCS = 8;

   do_all($MAX_PROCS, shared => [\%some_hash_of_cool_stuff,
                                 $hash_ref_of_uncool_stuff,
                                 \$what_I_had_for_lunch]) {
      # lunch is a scalar ref
      my ($cool_ref, $uncool_ref, $lunch) = shared_refs;

      my $fid = ${^FORK_ID}; # we are forking, right? parent gets id 0

      # S A F E - reads on "shared" things
      my %cool = %{$cool_ref};

      # U N S A F E (and warns akin to "Deep recursion scariness")
      if ($fid == $MAX_PROCS - 1) {
        # the last born are always the most entitled, so:
        $$lunch = q{YOU had pineapple pizza, you gave me cold oatmeal. }
                . q{You're not even my real dad!!};
      }
   }

What are we looking at above?

a. a "do_all" that creates a set of forked processes which, together 
with the parent, all do the same stuff
b. in addition to $$, there is a "${^FORK_ID}" that is in the range 
0..$MAX_PROCS-1
c. a list of references to share with the children on fork; I figure if 
"open" can share a file descriptor, it can share a pointer to perl data 
structures

I've tried my hardest to do this within what I consider to be the 
salient restrictions. Note, I don't mention "SMP" but I have tried to 
use what I think it'd look like semantically (today's forks are 
tomorrow's threads). In addition to that, I've snuck in the potential 
for SMP communication over the set of shared points.

OMG, does this expose some "safety" issues? Yes, and I love it.
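
For contrast, a short sketch of what happens with plain fork today 
(variable names are illustrative). Perl data is copied at fork time, so 
the child's assignment never reaches the parent; the child has to report 
back over something the kernel does share, here a pipe.

```perl
use strict;
use warnings;
use POSIX ();

my $lunch = 'pineapple pizza';

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    close $reader;
    $lunch = 'cold oatmeal';          # changes only the child's own copy
    syswrite($writer, "$lunch\n");    # so the child has to say so explicitly
    POSIX::_exit(0);
}

close $writer;
my $from_child = <$reader>;
chomp $from_child;
close $reader;
waitpid($pid, 0);

print "parent still has: $lunch\n";        # pineapple pizza
print "child reported:   $from_child\n";   # cold oatmeal
```

The proposed "shared => [...]" list would be the thing that removes the 
need for the explicit pipe.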

4. Is there a hidden opportunity with "open"? In particular, if all 
children (NOW) get to share a file descriptor, does this apply to ... oh 
yes it does. Sorry to stop mid-thought, but lo and behold, in [3], under 
the section "Opening a filehandle into a command", are these very 
interesting words:

   If you open a pipe on the command - (that is, specify either |- or -|
   with the one- or two-argument forms of open), an implicit fork is
   done, so open returns twice: in the parent process it returns the pid
   of the child process, and in the child process it returns (a defined)
   0. Use defined($pid) or // to determine whether the open was
   successful.
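
A minimal sketch of that implicit fork, assuming a system where fork is 
available. With "-|" the child's STDOUT is connected to the handle the 
parent reads from; the handle name is illustrative.

```perl
use strict;
use warnings;

# open forks here and returns twice: the child's pid in the parent,
# and a defined 0 in the child.
my $pid = open(my $from_kid, '-|');
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: anything printed to STDOUT travels down the pipe.
    print "hello from child $$\n";
    exit 0;
}

# Parent: read what the child wrote; close also waits for the child.
my $line = <$from_kid>;
close $from_kid;
print "parent $$ read: $line";
```

Using "|-" instead reverses the direction: the parent writes to the 
handle and the child reads it on STDIN.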

And this leads to my fifth and final question in today's installment:

5. How can perl expose fundamental unix IPC, in the context of "fork", 
using perlish semantics (get outta the way, DWIM, etc)? There are lots 
of exciting things I can think of with this one. And that's even before 
I throw out the words "object serialization", "wireline protocol", or 
"redis". But ultimately those things are not "shared" memory. They are 
cool and we should provide semantics for this, but that's more in the 
realm of data brokers and comm channels. But it's interesting and 
exciting, and totally doable in perlish ways.

Above I have expressed some ideas and examples. Some are possible NOW. 
Some not far off.

The essential elements of what I am trying to get at can be summed up as 
this:

* perl/Perl has riches yet to be discovered; we likely have all we need 
now; if not, the little remaining is not going to stand in the way
* there are still more things that can be used to extend the language 
semantically and that are useful in practice
* let's go out and find them, put them together in perlish ways, and get 
excited doing it
* it really is about exposing the unix os in perlish ways; ultimately 
this is where my borderline obsession with all of this comes from

Finally, given the above and the fun I've had in going on this Perl 
safari, I will concede to not using the term "threads". It would be 
sufficient to marry perl semantics using fork with the rich and diverse 
toolkit already in perl and at arm's reach in the OS userland and 
kernel. If successful, it won't matter what backs the lines of 
execution; they could be threads; more practically they are forks. I'm 
ok with that part of it (for now).

Refs:

0. https://metacpan.org/pod/distribution/perlsecret/lib/perlsecret.pod
1. https://www.perl.com/article/the-history-of-the-schwartzian-transform/
2. https://perldoc.perl.org/functions/fork
3. https://perldoc.perl.org/functions/open


