develooper Front page | perl.module.build | Postings from December 2012

Re: How To Build A Perl Package Database

From: Leon Timmermans
Date: December 16, 2012 19:58
Subject: Re: How To Build A Perl Package Database
Message ID: CAHhgV8jTocJOEBNzG-kcVT+WrJaujwWJOhhFEWcjXMd7C20BHQ@mail.gmail.com
On Sat, Dec 15, 2012 at 11:59 PM, Michael G Schwern <schwern@pobox.com> wrote:
> We have a lot of serious problems because we lack a database of installed
> distributions, releases and files.  There are serious problems with
> implementing one given A) the limitations of the standard Perl install and B)
> wedging it into existing systems.  But I think I have a solution.  It's similar
> to how meta data was slipped into the ecosystem without requiring authors to
> rewrite their releases or install a bunch of extra modules.  It just happens
> as part of the normal CPAN module upgrade process.
>
> I've been thinking that a minimal package database could be created by putting
> some hooks into ExtUtils::Install::install(), which every Perl build system
> ultimately uses, to record what gets installed.  That way when
> ExtUtils::Install is upgraded, the user gets a build database without
> upgrading everything else.
>
> This would be a fairly straightforward process at install time...
>
> 1) Copy everything to a temp directory
> 2) Record everything in that temp directory
> 3) Copy everything from temp into the real location
>
> You could probably optimize this by skipping the copy to temp and just have
> install() record stuff as it goes by, but this is the dumb, simple, robust way
> to do it.
>
> Storage is a problem.  The only reliable "database" Perl ships with is DBM, an
> on disk hash, so we can't get too fancy.  It might take several DBM files, but
> this is enough to record information and do simple queries.  What are those
> queries?
>
> * What version of the database is this?
> * What distributions are installed?
> * What release of a distribution is installed?
> * What files are in that release?
> * What version is that release?
> * What location was a release installed into? (core, vendor, site, custom)
> * What are the checksums of those files?
>
> And the basic operations we need to support.
>
> * Add a release (i.e. install).
> * Delete a release (and its files).
> * Delete an older version of a release (as part of install).
> * Delete an older version of a release, only if it's in the same release
>   location.  This is so CPAN installs don't delete vendor installed modules.
> * Verify the files of a release.
> * List distributions/releases installed.
>
> It would also store the MYMETA data which gives us a lot of information (such
> as dependencies) for free.

I can agree with all of that. Actually, starting a discussion about
this was on my todo-list for the last QA hackathon but I didn't get
around to it. Ideally, it should replace not only packlists but also
perllocal.

> This is all totally doable, and efficient enough, with a small pile of DBM
> files and Storable.  Where to put the database is a bit more complicated, see
> the list of open problems below.

Given that Storable's format isn't forward-compatible, something more
stable such as JSON would be more appropriate.
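
To make the JSON suggestion concrete, here is a rough sketch of what one release record could look like and how "verify the files of a release" might work against it. It is written in Python purely as a neutral illustration; the field names, the `schema` number and every function name are assumptions of mine, not anything ExtUtils::Install provides.

```python
import hashlib
import json
import os

SCHEMA_VERSION = 1  # hypothetical version number for the record layout


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def make_record(dist, version, location, files):
    """Build a plain dict describing one installed release."""
    return {
        "schema": SCHEMA_VERSION,
        "distribution": dist,
        "version": version,
        "location": location,  # e.g. core, vendor, site, custom
        "files": {path: sha256_of(path) for path in files},
    }


def save_record(record, db_path):
    """Write the record as human-readable JSON."""
    with open(db_path, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2, sort_keys=True)


def load_record(db_path):
    """Read a record back from its JSON file."""
    with open(db_path, encoding="utf-8") as fh:
        return json.load(fh)


def verify(record):
    """Return the sorted list of files that are missing or modified."""
    bad = []
    for path, digest in record["files"].items():
        if not os.path.exists(path) or sha256_of(path) != digest:
            bad.append(path)
    return sorted(bad)
```

A record like this stays readable by any version of the tooling that knows the field names, which is the property Storable does not give us.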

> There's lots and lots and lots of additional information which could be stored
> and queries and operations to allow, but if we can get the basics working
> it'll allow a heap of new solutions.  And I think this is a SMOP.
>
>
> Future possibilities include...
>
> * Auto-upgrade to SQLite if ExtUtils::Install::DB::SQLite is installed.
>
> If a special module is installed we can offer SQLite support (or whatever) for
> a more advanced database.  At install time it would copy the existing DBM
> system into its own database.
>
> In general, more functionality can be added as more optional (or bundled)
> dependencies are available to the system.  Through it all the basic DBM
> database would continue to be redundantly maintained to provide a fallback
> should those optional modules break or go away.

Having a proper database would be really nice, but I'm not sure if
it's going to be worth the hassle if we have a robust system already.

> * Upgrading the database.
>
> I'd like to put some thought into how things are laid out initially to avoid a
> lot of major revisions, and thought into what information should be recorded
> so it's available later, but eventually we're going to want to change the
> "schema", such as it is with DBM files.
>
> I figure this can happen as part of upgrading ExtUtils::Install.  It checks
> what version of the database you have and performs the necessary transforms to
> bring it up to the current version.  We know how to do this, just have to keep
> it in mind and remember to implement it.
>
> * Where to put the database?  What about non-standard install locations?
>
> $Config{archlib} would seem the obvious location, but it presents a
> permissions problem.  If a non-root user installs into their home
> directory, you don't want them needing root to write to the installation
> database.  There are several ways to deal with this.
>
> One is to simply not record non-standard install locations, but this loses
> data and punishes all those local::lib users out there.
>
> Another is to have a separate install database for non-standard install
> locations.  This makes sense to me, but it brings in the sticky problem
> of having to merge install databases.  Sticky, but still a SMOP.  Once you
> have to implement merging anyway, it now makes sense to have an install
> database for each install location.  One for core.  One for vendor.  One for
> perl.  And one for each custom location.  This has a lot of advantages to
> better fit how Perl layers module installs.
>
>     * allows separation of permissions
>     * allows queries of what's installed based on what's in @INC
>
> That second one is important.  When a normal user queries the database, they
> want to get what's installed in the standard library location.  When a
> local::lib user queries the database, they want to get what's installed in the
> standard library locations AND their own local lib.

The combination of these is problematic. You might upgrade EU::Install
in your local module path, but not have write permissions on the
system paths. In practice, we might have to support all our older
versions :-|
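
To illustrate the concern: a reader that merges one database per install location has to cope with older databases it cannot rewrite. The sketch below (Python again, only as a neutral illustration) upgrades old layouts in memory and lets the earliest location in the search path win. Both schema layouts and all names are invented for the example.

```python
def normalize(db):
    """Convert a database of any known schema to the current in-memory shape.

    The file on disk is never touched, so a read-only system database
    written by an older installer stays usable.
    """
    version = db.get("schema", 1)
    if version == 1:
        # Hypothetical v1 layout: distribution name -> version string only.
        return {
            name: {"version": ver, "files": []}
            for name, ver in db["releases"].items()
        }
    if version == 2:
        # Hypothetical v2 layout: name -> {"version": ..., "files": [...]}.
        return {name: dict(rel) for name, rel in db["releases"].items()}
    raise ValueError(f"unknown database schema {version!r}")


def installed(layers):
    """Merge per-location databases; earlier layers shadow later ones.

    `layers` is a list of (location, database) pairs ordered like the
    module search path, so a local install hides a system one.
    """
    merged = {}
    for location, db in layers:
        for name, release in normalize(db).items():
            merged.setdefault(name, {**release, "location": location})
    return merged
```

Every schema ever released needs its own branch in `normalize`, which is exactly the "support all our older versions" cost.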

> Not perfect, but gets us off the ground.  It's not a great database, but it
> does the important job of recording the critical install-time data for later
> use.  It's implementable within the current system.  It doesn't require a bunch
> of dependencies, just one upgrade.  It works with most existing module
> releases.  It solves a major design problem with the Perl module system.
>
> I think it's a Simple(?!) Matter Of Programming in ExtUtils::Install to get it
> off the ground.  IMO the most important bit of coordination is putting some
> thought into what the basic database should look like so we don't have to
> worry about complicated upgrades later.

I'm not sure it's as simple as you make it sound, but it is a good
idea nonetheless.

Leon
