Re: The plan for ext/ and dual-life modules
From: Yuval Kogman
August 29, 2009 17:04
Message ID: firstname.lastname@example.org
2009/8/30 demerphq <email@example.com>:
> That doesnt make me feel all warm and fuzzy frankly. Submodules to me
> still seem to be too unstable for us.
Not really, the implementation is very simple. Submodules aren't
unstable in any way; they just have a problem of expectations.
The problem is that people usually expect to find something like SVN
externals, which will let you update more easily, whereas submodules
require a commit in the repository that includes them in order to be
updated. In our case this is very appropriate.
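To make the "requires a commit" point concrete, here is a minimal sandbox sketch. The repository names (upstream standing in for Foo.git, perl standing in for perl.git) and the helper g are made up for illustration, and protocol.file.allow is set only because recent git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
# Sandbox sketch: a superproject pins a submodule to one exact commit.
# "upstream" and "perl" are hypothetical stand-ins for Foo.git and perl.git.
set -e
work=$(mktemp -d)
cd "$work"

# Helper so the sketch does not depend on the caller's git configuration.
g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always "$@"
}

# A tiny upstream module repository with a single commit.
g init -q upstream
( cd upstream && echo 'package Foo; 1;' > Foo.pm &&
  g add Foo.pm && g commit -q -m 'Foo 0.01' )
first=$(git -C upstream rev-parse HEAD)

# A superproject that includes the module as a submodule at dual/Foo.
g init -q perl
( cd perl && g submodule --quiet add "$work/upstream" dual/Foo &&
  g commit -q -m 'add dual/Foo as a submodule' )

# Upstream moves on ...
( cd upstream && echo '# more code' >> Foo.pm && g commit -q -am 'Foo 0.02' )

# ... but the superproject still records the commit it was given.
pinned=$(git -C perl rev-parse HEAD:dual/Foo)
echo "pinned in perl: $pinned"
echo "upstream HEAD:  $(git -C upstream rev-parse HEAD)"
```

The pinned commit only changes when someone commits a new value for dual/Foo in the superproject, which is the behaviour that surprises people expecting SVN externals.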
People then blog in anger that submodules have "problems", just
because the name "submodules" sounded like it would be appropriate for
their large, multi repository deployment issues, and it turns out it
> I mean, we have a mostly git-unaware/novice user base, submodules seem
> to fox experienced users regularly. Using them for our community seems
> to me to be asking for trouble.
I think the opposite is true. The difficulty is in setting up
submodules (though that isn't really complicated either). Once they
are in place I think they would drastically *reduce* the chance of
The process for updating a module is quite clear:
This is pretty much the same as what you would need to run to generate
the patch, namely to overwrite all the files in dual/Foo with the
files from the upstream, and create a commit in perl.git to that
effect, and send it for review.
rm -rf dual/Foo
tar -zxf Foo-0.123.tar.gz   # or some other snapshot
mv Foo-0.123 dual/Foo       # put the new files where the old ones were
git add -A dual/Foo         # stage additions, changes and deletions
that seems a lot simpler and safer to me.
In both cases we end up with a git commit that can be sent as a patch
and that will update the dual lifed module to the new version.
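For comparison, a submodule-based update could look roughly like the sketch below. It is not taken from any agreed perl.git procedure: the sandbox repositories, the dual/Foo path and the helper g are assumptions made so the example can run on its own. Only the last block of commands is the update itself.

```shell
#!/bin/sh
# Sandbox sketch of updating a submodule and committing the new pin.
set -e
work=$(mktemp -d)
cd "$work"

g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always "$@"
}

# Setup: hypothetical upstream module plus a superproject that embeds it.
g init -q upstream
( cd upstream && echo 'our $VERSION = "0.01";' > Foo.pm &&
  g add Foo.pm && g commit -q -m 'Foo 0.01' )
g init -q perl
( cd perl && g submodule --quiet add "$work/upstream" dual/Foo &&
  g commit -q -m 'add dual/Foo as a submodule' )

# Upstream publishes a new version.
( cd upstream && echo 'our $VERSION = "0.02";' > Foo.pm &&
  g commit -q -am 'Foo 0.02' )
new=$(git -C upstream rev-parse HEAD)

# The update itself: move the submodule to the new commit, then record it.
cd perl
( cd dual/Foo && g fetch -q origin && g checkout -q "$new" )
g add dual/Foo
g commit -q -m "update dual/Foo to $new"

# The resulting commit is an ordinary one that can be reviewed as a patch;
# this diff shows the old and new pinned commit.
g diff HEAD~1 HEAD -- dual/Foo
```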
> Beside that they are cool and froody, whats the advantage? Especially
> given that our Dual-lifed modules arent necessary vc-ed under git
> anyway, whats the advantage of a submodule /at all/ in our case?
They let us treat dual lifed modules as separate subprojects (which
they are): instead of creating "update Foo to CPAN version" patches
periodically, that becomes an "update Foo to commit 0123decafbad from
The point is to give dual lifed modules' maintainers who want to use
git a more reliable way of sending updates downstream and receiving
patches from downstream, with the added side effect that the job of
a perl.git committer becomes slightly easier.
When the patching process is manual there's always additional room for errors.
Because dual/Foo is just a git clone of the upstream, the full
history is available, so it's much easier to backport patches written
against core into the CPAN versions. It's just a trivial case of 'git
merge' to bring that patch up to date.
See below for a more in-depth explanation of what is inferior about
doing this manually.
> Reread that sentence about 10 times. Then ask yourself if you really
> think submodules are a good idea for us (now).
That's not really what I'm suggesting, I was just making the point
that it's technically possible.
I think a much better process would be for me to be able to push to
> Especially try to think of what would be involved explaining how to do
> this to the myriad CPAN owners.
It can and should be completely opt in.
There's no need to force everyone to do that, but if the author or
maintainer already maintains a git repository for their module, this
workflow is precisely what submodules were designed to make possible.
I would benefit from this because even for a trivial dual-life module
like Tie::RefHash I still dealt with a world of pain accepting patches
written against core. I honestly have no idea what the current state
of integration is, either. I'd need to compare the .pm file manually
to find out.
To backport patches I apply them to a branch that is restructured to
look like Perl core (lib/Tie/RefHash.pm and lib/Tie/RefHash/*.t ) and
then I manually merge/cherry-pick so that they get backported to the
normal layout. To create a patch to update perl core I merge my master
branch into the Perl core-like branch and manually figure out what the
last sync point was, apply that diff to perl.git and then send a new
patch for that (though actually I think the last time I still used
The new layout solves the major difficulties with this patch juggling,
but in the event that an author has written code that has not made it
into core yet, and then receives a patch against core there is an
implicit branching point. The proper way to apply a patch written
against perl core as a dual life module maintainer would be:
git checkout core-integration          # go back to the last integration point
patch -p2 < patch                      # hopefully this goes cleanly
git commit -a -m "patch from core"     # record it (git add any new files first)
git checkout master                    # go back to our local edits
git merge core-integration             # or rebase
meaning the module maintainer has to manually locate the correct point
in the module's history and attach the patch to that point in order to
model the divergence in a way that will allow their local changes to
be pushed back downstream cleanly.
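The sequence above can be tried in a throwaway repository. In this sketch the file contents, the commit messages and the helper g are invented, and the patch from core is simulated by editing the file directly on the core-integration branch instead of running patch.

```shell
#!/bin/sh
# Sandbox sketch: attach a core patch at the last integration point,
# then merge it into master, which already has unrelated local edits.
set -e
work=$(mktemp -d)
cd "$work"

g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

g init -q Foo
cd Foo
g symbolic-ref HEAD refs/heads/master   # make sure the branch is "master"

# The state that was last synced into core.
printf '%s\n' 'line 1' 'line 2' 'line 3' 'line 4' 'line 5' 'line 6' 'line 7' > Foo.pm
g add Foo.pm
g commit -q -m 'version synced into core'
g branch core-integration               # marks the last integration point

# Local work on master that has not reached core yet (touches the last line).
sed 's/^line 7$/line 7, edited upstream/' Foo.pm > Foo.tmp && mv Foo.tmp Foo.pm
g commit -q -am 'local edit'

# A patch written against core arrives (simulated: it touches the first line).
g checkout -q core-integration
sed 's/^line 1$/line 1, fixed in core/' Foo.pm > Foo.tmp && mv Foo.tmp Foo.pm
g commit -q -am 'apply patch written against core'

# Bring the core patch into master alongside the local edit.
g checkout -q master
g merge -q -m 'merge patch from core' core-integration
cat Foo.pm
```

Because the patch is committed at the point where core and the module last agreed, git can combine it with the later local edits on its own; the manual part is knowing where that point is.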
To summarize, with submodules:
1. the state of synchronization is known, it doesn't rely on
humans documenting it (e.g. dates, or revision IDs in perl.git's
changelogs to help the maintainer)
2. creating a patch to bring perl.git up to date is very simple;
it does require learning that dual/Foo is a submodule, but that's not
hard and, again, it's opt-in
3. applying patches written for downstream to Foo.git is much much
simpler, removing the major difficulties for a dual life module
That doesn't mean that everyone needs to learn how to use submodules,
it's only the people that want to avoid going through this manual