Re: Need suggestions for terminology
From: David Golden
December 2, 2011 19:48
Message ID: CAOeq1c9EbZECyi6xkqvX44L5VfMmS33qgtnMcpvHVjJNDONDZg@mail.gmail.com
On Fri, Dec 2, 2011 at 8:21 PM, Jeffrey Thalhammer wrote:
> Hi everyone-
> I need some suggestions for terminology to use in my code and documentation. I'm picky about names, so this is important to me (perhaps more than it should be). The context is Pinto, which is yet-another suite of libraries and tools for building a private CPAN-like repository. Here's what I have so far...
I would encourage you to use existing names/conventions whenever
possible. Some references to consider, if you haven't already:
https://metacpan.org/module/CPAN::DistnameInfo (which you've seen)
> Distribution: A Distribution is an abstract concept that defines relationships between packages. The minimal concrete implementation of a Distribution would be just a META.json (or equivalent) file. Distributions also have names and versions like Foo-Bar-1.2
No. At best a distribution is a collection of zero or more modules.
(It could be all scripts and no modules.) META is not required.
> Distribution Archive: A Distribution Archive is the physical manifestation of a Distribution, and corresponds to an actual file on the local disk. For example, /home/jeff/Foo-Bar-1.2.tar.gz or C:\MyDocuments\Foo-Bar-1.2.tar.gz
I don't think you gain much by distinguishing this from distribution.
If you need to, I would consider "abstract distribution" for the
concept of a collection of modules and "distribution" for the archive
file as that is how it's commonly referred to elsewhere.
> Distribution Path: The Distribution Path is how an Archive is identified in a CPAN index. It is basically a URL fragment that looks like A/AU/AUTHOR/Foo-Bar-1.2.tar.gz. This is the term I'm having the most trouble with. CPAN::DistnameInfo calls this the "prefix" but I don't really like that either.
A distribution file can only be uniquely identified on CPAN by
AUTHOR/Foo-Bar-1.23.tar.gz. This is why I think that separating the
concept of distribution from path is problematic. If you define
distribution to be the tarball, then the name of the distribution is
AUTHOR/Foo-Bar... etc. (The "A/AU" is unnecessary as it can be
It's what rjbs and I called a cpan::distfile URI.
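To illustrate why the "A/AU" portion carries no extra information, here is a minimal sketch in Python (chosen only for brevity). It assumes the usual CPAN layout in which the two leading directories are the first one and first two characters of the author ID; the function name expand_distfile is hypothetical and not part of any existing tool.

```python
def expand_distfile(distfile: str) -> str:
    """Expand 'AUTHOR/Foo-Bar-1.23.tar.gz' into 'A/AU/AUTHOR/Foo-Bar-1.23.tar.gz'.

    Assumes the two leading directories are derived purely from the
    author ID, so they add nothing to the identity of the file.
    """
    author, sep, filename = distfile.partition("/")
    if not sep or not author or not filename:
        raise ValueError(f"expected 'AUTHOR/filename', got {distfile!r}")
    # First character, then first two characters, then the full author ID.
    return f"{author[0]}/{author[:2]}/{author}/{filename}"


print(expand_distfile("AUTHOR/Foo-Bar-1.23.tar.gz"))
# prints: A/AU/AUTHOR/Foo-Bar-1.23.tar.gz
```

Because the longer form can always be rebuilt this way, the short AUTHOR/filename form is enough to serve as the identifier.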
If you need, you might discuss "distribution name" as the bundle that
may span multiple releasing authors, but with the huge caveat that two
distribution files with the same "name" (sans author) need not have
anything to do with each other.
These might be the same inside or they might be different. My Foo
distribution could contain module ACME::Foo and yours might contain
Acme::FOO. Those would be two totally legal, indexable distributions
as far as PAUSE is concerned.
Likewise "distvname" from DistnameInfo is a useful concept, but not
sufficient to identify uniqueness.
My general conclusion is what I said in my blog post. A
"distribution" is an archive file 'AUTHOR/Tarball-version.suffix' that
contains 0 or more modules. That is the only definition that doesn't
get you into trouble with edge cases.
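The uniqueness point can be sketched as follows, again in Python for brevity. The filename pattern is a deliberately simplified assumption and far less thorough than what CPAN::DistnameInfo actually does. The author IDs ALICE and BOB and the helper name parse_distfile are hypothetical.

```python
import re

# Simplified assumption: "<name>-<version>.<suffix>" where the version
# starts with a digit (optionally prefixed by "v").
_DIST_RE = re.compile(
    r"^(?P<name>.+)-(?P<version>v?\d[\d._]*)\.(?:tar\.gz|tgz|zip)$"
)


def parse_distfile(distfile: str) -> dict:
    """Split 'AUTHOR/Name-1.0.tar.gz' into author, distname, version, distvname."""
    author, _, filename = distfile.partition("/")
    match = _DIST_RE.match(filename)
    if match is None:
        raise ValueError(f"cannot parse {distfile!r}")
    name, version = match["name"], match["version"]
    return {
        "author": author,
        "distname": name,
        "version": version,
        "distvname": f"{name}-{version}",
    }


# Two unrelated uploads that happen to share a name and version.
uploads = ["ALICE/Foo-1.0.tar.gz", "BOB/Foo-1.0.tar.gz"]

by_distvname = {parse_distfile(u)["distvname"]: u for u in uploads}
by_distfile = {u: parse_distfile(u) for u in uploads}

print(len(by_distvname))  # 1: the second upload silently replaced the first
print(len(by_distfile))   # 2: author-qualified keys keep both
```

Keying on the author-qualified file is the only choice here that does not lose one of the two uploads.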
> Package: A package is just a package, in the usual way. That is, something declared with the "package" keyword.
> Module: I actually avoid using the term Module because I think it is often misused.
Agreed. See my blog post about it. A "Module" is something that you
can give to "use" or "require" and get a thing loaded. It may contain
zero or more Packages (i.e. namespaces).
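A rough sketch of the module/package distinction, in Python for brevity. The first helper mirrors how Perl maps a bareword module name given to "use" or "require" onto a relative file path. The second is a naive scan for "package" statements, given only as an illustration; it ignores POD, heredocs and the other cases a real indexer has to handle. Both helper names are hypothetical.

```python
import re


def module_to_relpath(module: str) -> str:
    """Map a module name such as 'Foo::Bar' to the relative path 'Foo/Bar.pm'."""
    return "/".join(module.split("::")) + ".pm"


# Naive assumption: a package statement starts a line.
_PACKAGE_RE = re.compile(r"^\s*package\s+([A-Za-z_][\w:]*)", re.MULTILINE)


def packages_declared(source: str) -> list:
    """Return the package names declared in a piece of Perl source text."""
    return _PACKAGE_RE.findall(source)


source = "package Foo::Bar;\nsub x { 1 }\npackage Foo::Bar::Helper;\n1;\n"

print(module_to_relpath("Foo::Bar"))  # Foo/Bar.pm
print(packages_declared(source))      # ['Foo::Bar', 'Foo::Bar::Helper']
print(packages_declared("1;\n"))      # []
```

One loadable file (the module) yields two packages in the first case and none in the second, which is why the two terms should not be used interchangeably.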
> Repository: A repository is a general term for any CPAN-like pile of files. This includes CPAN mirrors, as well as any DarkPAN or mini-cpans. A repository has a URL that identifies the entry point. For example: http://cpan.perl.org
Fine, though you imply but don't state that a repository contains
distributions in a particular directory layout below the base URL.
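The implied layout can be made explicit with a short sketch, in Python for brevity. It assumes the conventional CPAN structure, where distribution files live under authors/id/ and the package index is modules/02packages.details.txt.gz below the base URL. The function names are hypothetical, and the base URL is simply the example quoted above.

```python
def distribution_url(base_url: str, index_path: str) -> str:
    """Join a repository base URL with a path as it appears in the index."""
    return base_url.rstrip("/") + "/authors/id/" + index_path.lstrip("/")


def index_url(base_url: str) -> str:
    """Location of the package index, assuming the conventional layout."""
    return base_url.rstrip("/") + "/modules/02packages.details.txt.gz"


base = "http://cpan.perl.org"

print(distribution_url(base, "A/AU/AUTHOR/Foo-Bar-1.2.tar.gz"))
# prints: http://cpan.perl.org/authors/id/A/AU/AUTHOR/Foo-Bar-1.2.tar.gz
print(index_url(base))
# prints: http://cpan.perl.org/modules/02packages.details.txt.gz
```

Stating this layout in the definition tells readers that a base URL alone is not enough: a client also relies on these fixed relative locations.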