
Re: PTS2023 - Multiple Indices Discussion

From: book
Date: May 2, 2023 02:17
Subject: Re: PTS2023 - Multiple Indices Discussion
Message ID: ZFBypM/pS7rBpiNo@kapow.home

On Sun, Apr 30, 2023 at 04:04:43PM +0100, Paul "LeoNerd" Evans wrote:
> (A writeup of a discussion we held at PTS2023)
> 
> ## The Problem To Be Solved
> 
> **Primary usecase**: Authors who want to use new perl syntax/features
> in new versions of their modules, without breaking "older perl".
> Ideally, allow older perl versions to install prior versions of said
> module.
> 
> *For example*: Example::Module version 2.5 works on all perl versions.
> Author wants to use some new features that appear in 5.36. So, author
> bumps version number up to 3.000, sets required perl version to 5.036,
> uploads to PAUSE. Ideally, users on older perl versions will still see
> version 2.5 if they try to install it.
> 
> Secondary usecase: Authors who, having uploaded a newer version of a
> module that now has a later minimum perl version requirement, wish to
> still release bugfixes for older perl versions. These could be
> described as *non-monotonic* releases; releases whose version number is
> not a monotonic increase on the previous version.
> 
> *For example*: The above author now wishes to fix a bug in
> Example::Module version 2.5, so uploads version 2.6. This becomes
> available for the older perl versions, while still leaving 3.0
> available for those on perl 5.36 or later.

I've been in the same discussion, and have started to work on part of
a solution. I believe the first part is less complex than made out in
Paul's email, and it can be used to build the second part.

## The PAUSE index

The PAUSE index is keyed by module/package, and points to the CPAN
distribution that provides it. An optional version of the module is
listed in the data, but each package only shows up once per index.

What makes the PAUSE index special is that *it reflects the state of
permissions at the moment of the upload of the distribution*.

When using CPAN.pm to install a *package*, the index is used to map the
requested module/package to a distribution. CPAN.pm installs the
corresponding *distribution*, installing all other packages in it as a
side-effect.
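
To make the package-to-distribution lookup concrete, here is a minimal sketch of reading an index in the 02packages.details.txt layout (a header block, a blank line, then one "package version path" line per package). The helper name `parse_packages_index` and the sample data are made up for illustration; this is not how CPAN.pm itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the text of a 02packages.details.txt file
# into a hash of package name => { version => ..., dist => ... }.
sub parse_packages_index {
    my ($text) = @_;

    # The header block is separated from the package lines by a blank line.
    my ( $header, $body ) = split /\n\s*\n/, $text, 2;

    my %index;
    for my $line ( split /\n/, $body // '' ) {
        my ( $package, $version, $dist ) = split ' ', $line;
        next unless defined $dist;    # skip malformed lines
        $index{$package} = { version => $version, dist => $dist };
    }
    return \%index;
}

# Made-up sample data in the shape of the real file.
my $sample = <<'END';
File:         02packages.details.txt
Description:  Package names found in directory $CPAN/authors/id/
Columns:      package name, version, path

Example::Module         2.5    A/AU/AUTHOR/Example-Module-2.5.tar.gz
Example::Module::Util   undef  A/AU/AUTHOR/Example-Module-2.5.tar.gz
END

my $index = parse_packages_index($sample);

# Asking for any package in the distribution leads to the same tarball.
print "$index->{'Example::Module::Util'}{dist}\n";
```

Each package appears once, so the hash never has to choose between two distributions for the same name.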

If a package is listed in the index, the author of the distribution had
the proper permissions on it at the time of the distribution upload.

PAUSE *does not* index non-monotonic releases. The distribution file is
available on CPAN, but the PAUSE indexer ignores it completely. Anything
that uses the PAUSE index will never know it was ever uploaded.
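
As a rough illustration of what "non-monotonic" means here, the comparison can be sketched with the core `version` module. The helper `is_non_monotonic` is hypothetical, and treating an equal version as non-increasing is an assumption; the real PAUSE indexer applies more rules than this.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: true when $new_version is not an increase on the
# version currently in the index. Equal versions count as non-increasing
# (an assumption made for this sketch).
sub is_non_monotonic {
    my ( $new_version, $indexed_version ) = @_;
    return version->parse($new_version) <= version->parse($indexed_version);
}

# The bugfix release from Paul's example: 2.6 uploaded after 3.000.
print "2.6 after 3.000 would be skipped\n"
    if is_non_monotonic( '2.6', '3.000' );
```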


## Building an index for older Perl versions

Since the PAUSE indexer ignores non-monotonic releases, I'm going to
ignore them in the first part of this email.

> It is considered infeasible to get PAUSE alone to solve this. We can't
> just generate multiple 02packages-VER.txt files, for example.

Let's assume PAUSE had been generating a 02packages-VER.txt for each
published Perl version (at the time of distribution upload) for years. How
would that work?

I think it would go like this for every upload:

1. get a minimum supported Perl version (MIN) from the
   distribution's META file (set MIN = 0 if there is no data)
2. add a line for each authorized package in the distribution (i.e. the
   distribution author has permissions over that package) to all indices
   where VER >= MIN

Under that simple hypothetical model, we would have a full set of
02packages-VER.txt files today.
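
The two steps above can be sketched as follows. Everything here is an assumption made for illustration: the list of Perl versions is truncated, `index_upload` is a hypothetical name, and `provides` is taken to contain only the packages that already passed the permission check.

```perl
use strict;
use warnings;
use version;

# Hypothetical, truncated list of published Perl versions.
my @PERL_VERSIONS = qw( 5.030 5.032 5.034 5.036 5.038 );

# One index per Perl version: $indices{VER}{package} = { version, dist }
my %indices = map { $_ => {} } @PERL_VERSIONS;

# $upload = {
#     dist     => path of the distribution,
#     min_perl => minimum perl from META, or undef if there is no data,
#     provides => { package => version } for authorized packages only,
# }
sub index_upload {
    my ($upload) = @_;

    # Step 1: MIN defaults to 0 when the META file has no data.
    my $min = version->parse( $upload->{min_perl} // 0 );

    # Step 2: add the packages to every index where VER >= MIN.
    for my $ver (@PERL_VERSIONS) {
        next unless version->parse($ver) >= $min;
        for my $package ( sort keys %{ $upload->{provides} } ) {
            $indices{$ver}{$package} = {
                version => $upload->{provides}{$package},
                dist    => $upload->{dist},
            };
        }
    }
    return;
}

# Paul's example: 2.5 works everywhere, 3.000 requires perl 5.036.
index_upload( {
    dist     => 'A/AU/AUTHOR/Example-Module-2.5.tar.gz',
    min_perl => undef,
    provides => { 'Example::Module' => '2.5' },
} );
index_upload( {
    dist     => 'A/AU/AUTHOR/Example-Module-3.000.tar.gz',
    min_perl => '5.036',
    provides => { 'Example::Module' => '3.000' },
} );

print "$_: $indices{$_}{'Example::Module'}{version}\n" for @PERL_VERSIONS;
```

In this sketch the indices for 5.030 through 5.034 stay on 2.5, while 5.036 and 5.038 move to 3.000.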


During the Perl Toolchain Summit, I started working on building
this (a packages index per perl version), using the historical
02packages.details.txt information which is published in a git repository
going back to 2012.

The repository is available at https://github.com/batchpause/PAUSE-git
(cloning it takes a long time...)

The idea is to seed each per-version index with the first index in that
repository. Then, as new releases are indexed by PAUSE and show up in the
historical data repository, they are only added to a per-version index
when the distribution's metadata says they work with that version of
Perl (the VER >= MIN rule described above).

This should give us what we want for monotonic releases. Basically,
the index for a given version of Perl will stop getting updates for a
distribution when the metadata says the newer versions don't support
it anymore.

The git repository is updated every half hour, so the per-version
indices will always lag behind the PAUSE index by at least that
much time.

I think this covers the first part of Paul's email.

We can rebuild a per-version set of indices, just from the historical
02packages.details.txt data, without the need for overlays.

> ## Supporting Non-Monotonic Releases

I think for the second part, we can patch the indices described above.

The main issue is making sure that the changes are authoritative, per
the PAUSE model. The META file points to a mapping file and, by virtue
of having been indexed by PAUSE, that mapping file can be trusted for
the modules in the distribution that the author has permission on
(and only those).

-- 
 Philippe Bruhat (BooK)

 There is no solution to a problem of sheer greed.
                                    (Moral from Groo The Wanderer #94 (Epic))
