Re: [PATCH] Module::CoreList delta support

From: David Leadbeater
Date: August 4, 2012 14:29
Subject: Re: [PATCH] Module::CoreList delta support
Message ID: CAP9KPhD5UtF2XKkX+ATBRoiqC3vn28DAe09c7hYBYH5h7+qeiQ@mail.gmail.com

On 4 August 2012 21:29, Aristotle Pagaltzis <pagaltzis@gmx.de> wrote:
[snip details of space separated table format]
> But the vast expanses of whitespace gzip-compress to peanuts (<25Kb).
> Do we have gunzip in core?

Yes, see the approach BinGOs mentioned, code here:
https://github.com/bingos/module-snorelist/blob/master/Data.PL

(Note that this uuencodes the gzipped data, which makes it slightly
larger; I'm not sure the uuencoding is really needed.)
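
For reference, a minimal sketch of that round trip using only core
modules (IO::Compress::Gzip, IO::Uncompress::Gunzip and the built-in
pack/unpack 'u' template). The table contents and variable names are
made up for illustration; this is not the code from Data.PL.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Made-up stand-in for a space-padded table: lots of repeated whitespace.
my $table = join '',
    map { sprintf "%-40s %s\n", "Example::Module$_", '1.00' } 1 .. 50;

# Compress in memory, then uuencode so the result is plain printable text.
gzip(\$table => \my $gz) or die "gzip failed: $GzipError";
my $uu = pack 'u', $gz;

# Reverse both steps.
my $decoded = unpack 'u', $uu;
gunzip(\$decoded => \my $round_trip) or die "gunzip failed: $GunzipError";

printf "raw: %d bytes, gzipped: %d bytes, uuencoded: %d bytes\n",
    length $table, length $gz, length $uu;
print "round trip ok\n" if $round_trip eq $table;
```

The uuencoded form is always somewhat larger than the raw gzip bytes,
which is the overhead mentioned above.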

[snip details of searching data file]
> So we’re looking at <1MB in memory (incl. all overheads), a pittance on
> disk, near zero load time, most parsing work done in a few heavy-weight
> builtins with almost no looping in Perl code, and equally fast access to
> the data by either axis, with no spin-up key index generation for either
> of them.

Putting a tie interface on top of that might not be that nice. I do
wonder if the next step is to make a cleaner API and deprecate access
via the hash "API", as mentioned earlier in the thread.
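
To make the distinction concrete, here is a small sketch contrasting
the two styles with what Module::CoreList already offers: direct
access to the %Module::CoreList::version hash versus the
first_release method. The choice of File::Spec and perl 5.8.8 is just
an example.

```perl
use strict;
use warnings;
use Module::CoreList;

my $module = 'File::Spec';

# Hash-style access: reaches directly into the package's data structure,
# keyed by perl release and then by module name.
my $via_hash = $Module::CoreList::version{5.008008}{$module};

# Function-style access: callers never see how the data is stored.
my $first = Module::CoreList->first_release($module);

print "$module in perl 5.8.8: $via_hash\n";
print "$module first shipped with perl $first\n";
```

With the second style the storage format can change freely; with the
first, any new format has to keep looking like a hash of hashes.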

> Will a patch be accepted if I try this and find the results live up to
> the promise? Did I miss any reason why this is a bad idea?

Sounds sane in general (not my call to accept patches though). With
this approach I assume generation will be involved somewhere and it's
worth thinking about that aspect:

* Where will the generation take place?
  (First thought is a .PL file run as part of the core perl build
  process and the Module-CoreList build process -- in that case we
  wouldn't reduce the size of those distributions, but may reduce the
  size of built packages.)
* What will the diff for a release manager look like?
  (Presumably the data can be in a nicer format than a table with very
  long lines if it's generated from something else.)
* How does this fit in with the scripts in Porting, etc.?
  (e.g. test_porting at the moment acts as a sanity test on
  CoreList.pm due to the strict version check there; it may be worth
  having an explicit test for data consistency.)
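
As a rough sketch of the generation and consistency-check ideas
together: short, diff-friendly records are rendered into fixed-width
rows, and a separate check verifies the result. Everything here is an
assumption for illustration (the record layout, the column width, the
build_table and check_table names, and the made-up releases and
modules); the actual table format was snipped above.

```perl
use strict;
use warnings;

# Hypothetical, made-up records: [ perl release, module, version ].
my @records = (
    [ '5.900001', 'Example::Alpha', '1.00' ],
    [ '5.900001', 'Example::Beta',  '0.42' ],
    [ '5.900002', 'Example::Alpha', '1.01' ],
);

# Render records as fixed-width rows; the source stays short and readable,
# only the generated output has the padded lines.
sub build_table {
    my ($recs, $width) = @_;
    my $out = '';
    for my $rec (@$recs) {
        my ($release, $module, $version) = @$rec;
        $out .= sprintf "%-*s%-*s%-*s\n",
            $width, $release, $width, $module, $width, $version;
    }
    return $out;
}

# Consistency check: every row has the expected length (sprintf pads but
# does not truncate, so an overlong field shows up here) and no
# (release, module) pair appears twice.
sub check_table {
    my ($table, $width) = @_;
    my (@problems, %seen);
    for my $row (split /\n/, $table) {
        push @problems, "bad row length: $row"
            unless length($row) == 3 * $width;
        my ($release, $module) = split ' ', $row;
        push @problems, "duplicate entry: $release $module"
            if $seen{"$release $module"}++;
    }
    return @problems;
}

my $table    = build_table(\@records, 24);
my @problems = check_table($table, 24);
print @problems ? join("\n", @problems) . "\n" : "table looks consistent\n";
```

A real .PL script would write the table to its target file rather
than keep it in memory, and the check could live in a test file so it
runs for both the core build and the CPAN distribution.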

David
