develooper Front page | perl.perl5.porters | Postings from February 2009

Re: understanding merge history

Dave Mitchell
February 2, 2009 09:16
On Thu, Jan 29, 2009 at 10:50:16AM +1300, Sam Vilain wrote:
> On Wed, 2009-01-28 at 09:53 +0100, Rafael Garcia-Suarez wrote: 
> > It has two ordering options : --topo-order (the default) and
> > --date-order. With --topo-order, IIUC, it will list the children
> > commits first, starting with the last branch when there's a merge.
> > That is, for your example : F E D2 D1 C3 C2 C1 B A. (With date-order
> > it will obviously mix C* and D* commits depending on the timestamps)
> These options are somewhat confusingly named.  --date-order is a more
> defined version of --topo-order, where it's first topological and then
> chronological as a tie breaker.  According to the man page, by default
> the commits are shown in reverse chronological order.
> > > Now suppose I want to cherry-pick E for maint. It seems to me that
> > > if I want all of the branch (ie both D1, and D2), then I just do
> > > cherry-pick -m 1 and it applies the diff between C3 and E to maint (ie one
> > > big commit containing D1 and D2).
> > 
> > I don't know, I never cherry-picked a merge... I once reverted a merge,
> > and it was a pain. (revert and cherry-pick are very similar operations
> > in git)
> Cherry-pick is really the wrong tool for dealing with merges.  It only
> works with the differences between two adjacent changes, and does a
> three-way merge of those differences into the current tree.  You can
> specify since git 1.5.4 which parent this is relative to.  But what do
> you want a cherry-picked merge to do?  A four-way merge?

Hmm, let me explain further. With reference to the original chart:

            C1 -> C2 -> C3
          /               \
    A -> B                  E -> F
          \               /
            D1 -> D2 ----/

IIUC, 'git cherry-pick C2' is functionally equivalent to
'diff C1 C2 | patch -', except that it has handy resolving mechanisms for
when the diff doesn't apply cleanly. Similarly:

    cherry-pick -m 1 E      same as     diff C3 E | patch -
    cherry-pick -m 2 E      same as     diff D2 E | patch -
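
To make the role of -m concrete, here is a toy Python sketch (not git's implementation). It assumes each commit is stored as a full snapshot plus an ordered parent list, and that C3 is E's first parent as in the chart; the file names and contents are made up. In git itself the two diffs would be 'git diff E^1 E' and 'git diff E^2 E'.

```python
# Toy model: each commit stores its parents and a full snapshot of the tree.
# File names and contents are hypothetical, chosen to mirror the chart.
commits = {
    "B":  {"parents": [],           "tree": {"f.c": "base"}},
    "C3": {"parents": ["B"],        "tree": {"f.c": "base", "c.c": "C work"}},
    "D2": {"parents": ["B"],        "tree": {"f.c": "base", "d.c": "D work"}},
    "E":  {"parents": ["C3", "D2"],
           "tree": {"f.c": "base", "c.c": "C work", "d.c": "D work"}},
}

def mainline_diff(name, mainline):
    """Return {path: (old, new)} between parent number `mainline`
    (1-based, like -m) and the commit itself."""
    commit = commits[name]
    parent = commits[commit["parents"][mainline - 1]]
    old, new = parent["tree"], commit["tree"]
    return {path: (old.get(path), new.get(path))
            for path in sorted(set(old) | set(new))
            if old.get(path) != new.get(path)}

print(mainline_diff("E", 1))  # relative to C3: the D side's changes
print(mainline_diff("E", 2))  # relative to D2: the C side's changes
```

So relative to parent 1 the merge "contains" everything the D branch brought in, and relative to parent 2 everything the C branch brought in.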

Bearing in mind that the algorithm for a maint pumpkin is:

    1. Generate a list of all commits in bleed since maint was branched;
        (ie ... A B C1 C2 C3 D1 D2 E F ...)

    2. For each commit, decide whether it's suitable for maint, and if so
    cherry-pick it (not necessarily in strict chronological order)

This simple algorithm fails in the presence of merges like E.
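
As a rough sketch of step 1, the chart can be encoded as a commit-to-parents map and walked parents-first; a commit with more than one parent is then the case the simple loop cannot handle. The dict and function names are hypothetical, and real history would come from something like 'git rev-list --reverse', which this does not call.

```python
# Hypothetical encoding of the chart: commit -> ordered list of parents.
parents = {
    "A": [], "B": ["A"],
    "C1": ["B"], "C2": ["C1"], "C3": ["C2"],
    "D1": ["B"], "D2": ["D1"],
    "E": ["C3", "D2"], "F": ["E"],
}

def topo_order(tip):
    """List everything reachable from `tip`, parents before children."""
    seen, order = set(), []

    def visit(commit):
        if commit in seen:
            return
        seen.add(commit)
        for parent in parents[commit]:
            visit(parent)
        order.append(commit)

    visit(tip)
    return order

def plan(tip):
    """Tag each commit as a plain 'pick' or as a 'merge' needing a decision."""
    return [(c, "merge" if len(parents[c]) > 1 else "pick")
            for c in topo_order(tip)]

for commit, kind in plan("F"):
    print(kind, commit)
```

Only E comes out tagged as a merge; everything else is a candidate for a plain cherry-pick.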

One approach is to just skip merge commits like E; so if I decide that
all the C's and D's are suitable for maint, I do (not necessarily all at
the same time or in this order):

    cherry-pick A
    cherry-pick B
    cherry-pick C1
    cherry-pick C2
    cherry-pick C3
    cherry-pick D1
    cherry-pick D2
    cherry-pick F

There are two problems with this. First, the D1 and D2 picks are likely to
have conflicts, which I then have to resolve, repeating work (possibly
erroneously) already done by E's committer.

But more seriously, it discards any hacks that were included in E; for example
C3 may have introduced 'int i' into a function, D2 introduced 'int j', and
E's committer, while resolving, realised that only one var was needed for
both, and renamed j to i in the D2 chunks. The cherry-picks above will just
completely miss that.
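
A toy illustration of the lost hack, in Python. It assumes snapshots can be modelled as sets of lines and ignores conflicts entirely; the line contents (c_work, d_work) are invented to match the i/j story.

```python
# Hypothetical snapshots: each is the set of lines in one function.
snap = {
    "B":  set(),
    "C3": {"int i;", "c_work(i);"},                # C branch introduces i
    "D2": {"int j;", "d_work(j);"},                # D branch introduces j
    "E":  {"int i;", "c_work(i);", "d_work(i);"},  # merge renames j to i
}
first_parent = {"C3": "B", "D2": "B", "E": "C3"}

def replay(base, picks):
    """Apply each pick's diff against its first parent onto `base`."""
    tree = set(base)
    for commit in picks:
        old, new = snap[first_parent[commit]], snap[commit]
        tree |= new - old   # lines the commit added
        tree -= old - new   # lines the commit removed
    return tree

skipping_merge = replay(snap["B"], ["C3", "D2"])  # picks that skip E
with_merge = replay(snap["B"], ["C3", "E"])       # pick E relative to C3

print(sorted(skipping_merge - snap["E"]))  # the un-renamed j lines remain
print(with_merge == snap["E"])
```

Skipping E leaves the j declaration and its use in place, while picking E relative to C3 reproduces the committer's merged result.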

On the other hand, if instead I do

    cherry-pick A
    cherry-pick B
    cherry-pick C1
    cherry-pick C2
    cherry-pick C3
    cherry-pick -m 1 E
    cherry-pick F

Then it all works, is completely safe, and I don't have to repeat the
merge resolutions. But I miss out on all the detail of the D branch - it
all gets added as one big pick.
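
The pick list above is what a first-parent walk of the chart produces. A minimal Python sketch, again with a hypothetical parents map and assuming C3 is E's first parent (git's own equivalent of the walk would be 'git rev-list --first-parent'):

```python
# Hypothetical encoding of the chart: commit -> ordered list of parents.
parents = {
    "A": [], "B": ["A"],
    "C1": ["B"], "C2": ["C1"], "C3": ["C2"],
    "D1": ["B"], "D2": ["D1"],
    "E": ["C3", "D2"], "F": ["E"],
}

def first_parent_picks(tip):
    """Follow first parents back from `tip`, then emit picks oldest first,
    adding -m 1 for any commit with more than one parent."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    commands = []
    for commit in reversed(chain):
        flag = "-m 1 " if len(parents[commit]) > 1 else ""
        commands.append(f"cherry-pick {flag}{commit}")
    return commands

print("\n".join(first_parent_picks("F")))
```

D1 and D2 never appear in the output, which is exactly the loss of detail described above.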

> If you want to perform a merge and re-use merge resolutions from another
> commit, which is about all the sense I can possibly make from the desire
> to cherry pick a merge, then the program to use is called git-rerere.
> It stores resolution information which is useful for avoiding repeating
> merges; I don't know if it can just be pointed at an arbitrary merge and
> update its resolution cache from that, but if it's not you can almost
> certainly fool it into doing so, with appropriate use of git-reset and
> so on.

I don't think that will help much in this scenario.

> > > This seems ok as long as the D branch is small and I want all of it.
> > > However, what happens if I only want some of the Ds added to maint?
> > > Or if I want all the D's, but as individual commits (for better bisecting
> > > when things go wrong, for example)?
> > 
> > I would use git log (or git rev-list) D1..D2 to get the list of SHA1 in
> > the branch, and cherry-pick them. There might be some cleverer things to
> > do, however.

Which suffers from the 'missing E hacks' I described earlier.

> Yes.  Or you can make a new branch at the end of the D line and use git
> rebase -i ... (it takes other arguments to specify where you want to
> rebase to) to interactively select which commits you want to try to
> rebase to the new place.


The optimist believes that he lives in the best of all possible worlds.
As does the pessimist.
