
Re: PAR and modules, accessing the lib directory

From: Steffen Mueller
Date: September 2, 2008 05:33
Subject: Re: PAR and modules, accessing the lib directory
Message ID: 48BD3270.6010700@sneakemail.com

Hi Alvar,

One problem I have been worrying about with having many different modes
of extraction: given the tentative three modes, how can you use more
than one of them in the same application? For example:
- some modules need to be able to scan @INC, so a full extraction is
necessary
- you generally might want fast loading from memory without temporaries
- you want on demand extraction for stuff that needs to be able to read
its own code, but not necessarily scan @INC

Alvar Freude wrote:
>> That depends on how you use it. If you use it with .par files, or with
>> standalone exes *AND* --clean, then you're right.
> 
> in my case with .par files, yes.

So I just thought: Why not set the desired extraction mode at creation
time and put it into the META.yml of the .par file? It's not an innocent
little change, as it needs *even more* pp options, won't work
automatically, and needs extra book-keeping during run-time usage. But
it might be worth the effort, I'm not sure.

>> The main problem with changing the behaviour is that nobody has the time
>> to do it.
> 
> depending on how much it is, I can help. But I have no real knowledge of 
> all the PAR internals!

There is already code for extracting everything. It's triggered for all
executables that were not built using the --clean option. For that
branch, there are again two ways it can happen: Either with Archive::Zip
(slow) or with Archive::Unzip::Burst (fast, but only works on Linux
because I suck, patches for other OSes welcome).

Executables built with --clean and .par files both have their own logic
for extraction, IIRC.

Next thing is: When or if somebody starts refactoring all that logic,
Scott's load-from-memory approach should be kept in mind so his branch
doesn't become entirely incompatible and un-mergeable.

If you want to have a go, I'd suggest looking at the code a bit and
seeing whether it makes any sense to you at all. There's a bunch of
experienced people on this list who usually try to help.

> one solution may be to have the possibility to extract some paths or 
> files; e.g. PAR::extract_path("path/in/the/par-files")

You can get the Archive::Zip handle of a .par file easily. From there
on, it's just a question of whether A::Zip supports extraction of
sub-paths. I'd think it does.
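
For illustration, here is a minimal sketch of what such a sub-path
extraction could look like using Archive::Zip's extractTree method.
The helper name extract_par_subpath is made up for this example and is
not part of PAR. The sketch opens the .par file itself; for a .par that
is already loaded, the handle returned by PAR::par_handle could
presumably be used instead of reading the file again.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);

# Hypothetical helper (not part of PAR): extract every member below
# $subpath in the archive $par_file into the directory $dest_dir.
sub extract_par_subpath {
    my ($par_file, $subpath, $dest_dir) = @_;

    my $zip = Archive::Zip->new();
    $zip->read($par_file) == AZ_OK
        or die "Cannot read '$par_file' as a zip archive\n";

    # extractTree() does a plain prefix replacement on member names, so
    # make sure both prefixes end in exactly one slash. Otherwise 'lib'
    # would also match a member called 'library/...'.
    (my $root = $subpath)  =~ s{/*\z}{/};
    (my $dest = $dest_dir) =~ s{/*\z}{/};

    $zip->extractTree($root, $dest) == AZ_OK
        or die "Cannot extract '$subpath' from '$par_file'\n";

    return $dest_dir;
}
```

With this, extract_par_subpath('foo.par', 'lib', '/tmp/foo-lib') would
write the archive member lib/Foo.pm to /tmp/foo-lib/Foo.pm, and the
target directory could then be added to @INC.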

>> Then again, switching to burst-extracting all loaded .par files may be a
>> bad idea in the advertised "use PAR '/foo/bar/*.par';" chainsaw
>> approach. It implicitly assumes that opening a bunch of .par files at
>> startup isn't prohibitively expensive.
> 
> yes, but it depends on the application if this is a problem. I have an 
> application which spends a lot more time in other things, so if it takes 
> one second more or less, it doesn't matter ;-)

A general change in behaviour must be the best solution for the broadest
user base, though.

> A new PAR::ExtractAll (or something like that) module unpacks everything 
> into a temp dir (File::Temp), which gets cleaned after the run, or as 
> alternative into a given directory. The lib directory/directories of the 
> are added to @INC and it works.

Having something akin to a subclass of PAR changing just a small bit of
behaviour might be more difficult than you think. Copying the source and
changing it would probably be easy, but a nightmare to maintain.

The general logic should be there already, see above.
I'd love to get some input on the
"extraction-method-controlled-by-switch-in-the.par" idea from the others
on the list. Supposing it's considered a good idea, what would need to
be done? Brainstorming:

- Add code to read the META.yml when the .par is loaded
--> see get_meta in PAR::Dist
--> will require a YAML loader, which isn't currently required by PAR or
PAR::Dist.
--> I committed an experimental change to PAR::Dist which adds a sub
that tries to load any known YAML loader/dumper and returns sub refs.
But for the PAR use case, there would absolutely have to be a dependency
on a YAML implementation. Specifying "you must have any of YAML,
YAML::Syck, YAML::Tiny, YAML::XS or potentially just Parse::CPAN::Meta"
in the Makefile.PL isn't possible, afaik. And I'd really, really dislike
adding an explicit dependency on any of those.

- Add a flag to META.yml to indicate the desired extraction mode
--> What are the supported modes?

- Refactor the extraction code in PAR, par.pl and potentially even the
par loader to make selection of extraction mode easier.
--> Yuck.

- If the flag's present in META.yml, use it, otherwise, default to the
current behaviour.
--> What's the current behaviour considering all edge cases?

- Add a way to specify the extraction mode on creation of .par files
like the --clean option to pp does for executables.
--> The mechanism should be as generic as possible, so it's easy to add
to the myriad tools that can spit out .pars.

- Add the book-keeping to PAR.pm for remembering which .pars require
which logic.
--> mind the @INC hooks! Yuck!
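
To make the first, second and fourth points a bit more concrete, here is
a rough sketch, not the actual PAR::Dist code. It tries the YAML modules
named above in turn and returns the first Load function it finds, then
looks up an extraction-mode flag in the META.yml text and falls back to
a default if the flag is missing or unknown. The key name
x_par_extraction and the mode names full, memory and on_demand are pure
assumptions made up for this example, since the supported modes are
still an open question.

```perl
use strict;
use warnings;

# Candidate YAML modules, in order of preference. Each of them offers a
# Load() function that turns a YAML string into a Perl data structure.
my @YAML_CANDIDATES
    = qw(YAML::XS YAML::Syck YAML YAML::Tiny Parse::CPAN::Meta);

# Return a code ref to the first Load() that can be found, or nothing
# if none of the candidates is installed.
sub find_yaml_loader {
    for my $module (@YAML_CANDIDATES) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        next unless eval { require $file; 1 };
        my $load = $module->can('Load') or next;
        return $load;
    }
    return;
}

# Hypothetical mode names, loosely following the three tentative modes.
my %KNOWN_MODE = map { $_ => 1 } qw(full memory on_demand);

# Given the text of a META.yml, return the requested extraction mode,
# or 'default' (meaning: keep the current behaviour) if there is no
# usable flag. 'x_par_extraction' is an invented key name.
sub extraction_mode {
    my ($meta_yaml) = @_;

    my $load = find_yaml_loader() or return 'default';
    my ($meta) = eval { $load->($meta_yaml) };
    return 'default' unless ref $meta eq 'HASH';

    my $mode = $meta->{x_par_extraction};
    return defined $mode && $KNOWN_MODE{$mode} ? $mode : 'default';
}
```

Falling back to 'default' whenever anything goes wrong (no YAML module,
unparsable META.yml, unknown mode) keeps existing .par files working
unchanged, at the price of silently ignoring typos in the flag.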

I bet I forgot something crucial.

>> [Scott Stanton's fastio branch]
> I read about this, this sounds very cool. And for a lot of usages it may 
> be a good choice.

Yes, but how do you make the software decide when it is?

Best regards,
Steffen
