develooper Front page | perl.perl5.porters | Postings from July 2007

Fear Branching vs Embrace Branching.

Michael G Schwern
July 12, 2007 02:53
Reading through all the conversations here I've noticed a definite split into
two camps.  And it's not SVN vs git.  It's more profound than just what VCS you
prefer.  It's what development model you prefer.  It boils down to this:  Is
branching something to be feared or embraced?  Which camp you're in largely
comes from what style of version control you're used to: traditional or
distributed.

To the user of traditional version control systems -- RCS, CVS, Perforce and
SVN -- branching is something to be feared.  Or, as Jesse rightly corrected
me, *merging* is something to be feared.  SVN made branching easy as pie but
didn't deal with merging (something I hear they're working on in 1.5).
Branches are necessary evils.  They are short-lived and closed before they
drift so far from the trunk that the merge becomes a nightmare.  Everyone must
commit to the trunk and there is a single master version.

To the user of the new breed of distributed version control systems -- darcs,
arch, git, SVK... pretty much every new VCS since Subversion -- branching and
merging are something to be embraced.  These new systems looked at the problems
of branching and merging and tackled them head on in order to decentralize the
development process.  Decentralization does not necessarily mean "everyone has
their own toy repository and releases their own toy distribution and oh god
what a coordination nightmare" which is the usual read when people first hear
about distributed version control.  Rather it means that not everything has to
happen in the trunk, in one great brittle revision.

But the single-trunk model is not chosen because it's a good thing, it's chosen
because the tools make the alternative so horrid.  It's been going on for so
long that for most it has been drilled into their heads that branching is to be
feared because branching is bad.  But branching is not bad, the tools are weak.

Think about any other aspect of your software development model and the drive
is towards decentralization.  We encapsulate our code into separate modules
and then deal with the integration issues, because it's better than the
alternative of shoving everything into one file.  Server management?  We have
per-developer sandbox servers, testing servers, staging servers and production
servers.  Any production house that pushes its changes directly into
production is laughed at.  Software design?  There's no longer a single
architect writing software design documents for mere programmers to implement;
it's a collaborative, iterative process broken into chunks and spread out
amongst the whole team.

Let's talk about the great bugaboo of traditional version control: keeping a
branch up to date with trunk.  You make a branch.  You commit some work.
Meanwhile, people are committing to trunk.  You want to update your branch
with those changes, how?  And once you've updated your branch once, what if
you want to do it again later?  Traditional version control systems do a very
poor job of this, requiring the user to do a lot of the bookkeeping themselves.
It rapidly gets messy and it's very easy to completely hose the branch or
generate piles and piles of spurious conflicts.

For distributed version control systems, this sort of thing is their bread and
butter.  "Pulling" the latest changes from trunk into a branch is generally one
simple command.  "Pushing" your branch's changes back into trunk is as well.
And you can do this as many times as you like; the VCS handles all the
bookkeeping for you.  Maintaining a branch is no longer a perilous affair.  And
that allows you to embrace, rather than fear, branching.
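
To make that bookkeeping concrete, here is a toy sketch in Python.  It is not
how any particular VCS is implemented.  It only assumes that history is a
graph of commits, that a merge commit records both of its parents, and that
commit names are unique.  The names Commit, ancestors, incoming and merge are
made up for this illustration.  Because the merge remembers where it came
from, a second update from trunk only has to consider what is new since the
first one.

```python
class Commit:
    """A commit is a name plus links to its parent commits (hypothetical model)."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)


def ancestors(tip):
    """Return the names of `tip` and every commit reachable from it."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit.name not in seen:
            seen.add(commit.name)
            stack.extend(commit.parents)
    return seen


def incoming(branch_tip, trunk_tip):
    """Names of trunk commits the branch has not absorbed yet."""
    return ancestors(trunk_tip) - ancestors(branch_tip)


def merge(branch_tip, trunk_tip, name):
    """Record a merge as a commit with two parents; that link is the bookkeeping."""
    return Commit(name, parents=(branch_tip, trunk_tip))


# Branch and trunk diverge from a common base.
base = Commit("base")
trunk1 = Commit("trunk-1", [base])
branch1 = Commit("branch-1", [base])

first_update = incoming(branch1, trunk1)       # what the first pull brings in
merged = merge(branch1, trunk1, "merge-1")
after_merge = incoming(merged, trunk1)         # nothing left to pull

# Trunk moves on; a later pull only sees the new work.
trunk2 = Commit("trunk-2", [trunk1])
second_update = incoming(merged, trunk2)
```

In a system without the two-parent link, the user would have to remember by
hand that "trunk-1" was already merged before asking for the second update.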

In either world, the developer can work in their own sandbox and then, when
they're ready, push their changes back into the trunk.  In the traditional
model, the sandbox is your checkout and the push is simply a commit.  The
trouble there is you can't commit until the whole task is done.  If it's a
large change, that means it enters trunk as one giant patch.  Difficult to review.

In the distributed model, your branch is your sandbox.  You can commit as
often as you like, when you like.  Do some work, test, log, commit, repeat.
This lets you do the work in small, easily reviewed chunks showing the
progression of what you're working on.  Need to rename a subroutine?  Make
that change and commit it.  Need to move some code?  Make it, commit it.  Fix
a small bug you uncovered?  Fix it, commit it.  Then when you're all done your
changes can be examined a piece at a time, easy to review.

This also allows the developer to work in small, iterative steps.  Do a
refactoring.  Then run the tests, ok they all pass.  Commit.  Add a new
function.  Run the tests... wait, they failed.  Well, you know everything was
working at the last commit, so it must be something you did in the current
diff!  This vastly narrows down the possibilities when debugging.

Let's say you want someone to review your change before you commit.  In a
traditional system this means mailing a patch off to a mailing list, something
YOU have to do manually.  What if you update something?  Mail off a new patch.
What if you change something and forget to resend a new patch?  Well, now
everyone's looking at an old patch, a pretty common mistake.  And, of course,
if it's a large patch it's difficult to follow what's changed and why.

But if I'm working in a branch that's checked into the master repo (not purely
distributed), anyone at any time can look at the work.  They can see the
complete history, logs and individual diffs.  There's no need to manually mail
out patches.  You can even catch when some work was forgotten by looking to
see what branches contain differences against trunk.
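
A rough Python sketch of that last check, under the same kind of toy
assumptions: history is a graph of uniquely named commits, and a branch counts
as "forgotten" if its tip is not reachable from trunk.  Commit, ancestors and
unmerged_branches are hypothetical names for this example, not any tool's API.

```python
class Commit:
    """A commit is a name plus links to its parent commits (hypothetical model)."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)


def ancestors(tip):
    """Return the names of `tip` and every commit reachable from it."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit.name not in seen:
            seen.add(commit.name)
            stack.extend(commit.parents)
    return seen


def unmerged_branches(branches, trunk_tip):
    """Return the sorted names of branches whose tip is not part of trunk's history."""
    in_trunk = ancestors(trunk_tip)
    return sorted(name for name, tip in branches.items()
                  if tip.name not in in_trunk)


# Two branches off the same base; only one of them was merged back.
base = Commit("base")
rename_sub = Commit("rename-sub", [base])
fix_bug = Commit("fix-bug", [base])
trunk = Commit("merge-rename", [base, rename_sub])

branches = {"rename": rename_sub, "bugfix": fix_bug}
forgotten = unmerged_branches(branches, trunk)
```

Here `forgotten` names the one branch whose work never reached trunk.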

So that's what it boils down to.  We've been trained by our tools for decades
to fear branching.  This has colored our development models and made them
overly centralized.  In just the last few years a whole new set of tools has
made branching easier so that we can embrace branching.  And now, finally, the
limitations of our version control system need no longer dictate our
development model.
