Info Node: (cvsbook.info)Basic Concepts

Basic Concepts
==============

If you've never used CVS (or any version control system) before, it's
easy to get tripped up by some of its underlying assumptions.  What
seems to cause the most initial confusion about CVS is that it is used
for two apparently unrelated purposes: record keeping and collaboration.
It turns out, however, that these two functions are closely connected.

Record keeping became necessary because people wanted to compare a
program's current state with how it was at some point in the past.  For
example, in the normal course of implementing a new feature, a developer
may bring the program into a thoroughly broken state, where it will
probably remain until the feature is mostly finished.  Unfortunately,
this is just the time when someone usually calls to report a bug in the
last publicly released version.  To debug the problem (which may also
exist in the current version of the sources), the program has to be
brought back to a usable state.

Restoring the state poses no difficulty if the source code history is
kept under CVS.  The developer can simply say, in effect, "Give me the
program as it was three weeks ago", or perhaps "Give me the program as
it was at the time of our last public release".  If you've never had
this kind of convenient access to historical snapshots before, you may
be surprised at how quickly you come to depend on it.  Personally, I
always use revision control on my coding projects now; it's saved me
many times.

To understand what this has to do with facilitating collaboration, we'll
need to take a closer look at the mechanism that CVS provides to help
numerous people work on the same project.  But before we do that, let's
take a look at a mechanism that CVS doesn't provide (or at least,
doesn't encourage): file locking.  If you've used other version control
systems, you may be familiar with the lock-modify-unlock development
model, wherein a developer first obtains exclusive write access (a lock)
to the file to be edited, makes the changes, and then releases the lock
to allow other developers access to the file.  If someone else already
has a lock on the file, they have to "release" it before you can lock it
and start making changes (or, in some implementations, you may "steal"
their lock, but that is often an unpleasant surprise for them and not
good practice!).

This system is workable if the developers know each other, know who's
planning to do what at any given time, and can communicate with each
other quickly if someone cannot work because of access contention.
However, if the developer group becomes too large or too spread out,
dealing with all the locking issues begins to chip away at coding time;
it becomes a constant hassle that can discourage people from getting
real work done.
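
To make the locking model concrete, here is a minimal sketch in Python.
It is a toy, not the interface of any real version control system: the
class LockingRepository, its methods, and the file name hello.c are all
invented for illustration.

```python
class LockError(Exception):
    """Raised when a file is locked by someone else."""


class LockingRepository:
    """Toy lock-modify-unlock store (hypothetical, for illustration only)."""

    def __init__(self, files):
        self.files = dict(files)   # file name -> contents
        self.locks = {}            # file name -> developer holding the lock

    def lock(self, name, developer):
        holder = self.locks.get(name)
        if holder is not None and holder != developer:
            raise LockError(f"{name} is locked by {holder}")
        self.locks[name] = developer

    def write(self, name, developer, contents):
        # Only the current lock holder may change the file.
        if self.locks.get(name) != developer:
            raise LockError(f"{developer} does not hold the lock on {name}")
        self.files[name] = contents

    def unlock(self, name, developer):
        if self.locks.get(name) == developer:
            del self.locks[name]


repo = LockingRepository({"hello.c": "int main() { return 0; }\n"})
repo.lock("hello.c", "A")
try:
    repo.lock("hello.c", "B")      # B must wait until A releases the lock
    b_blocked = False
except LockError:
    b_blocked = True
repo.write("hello.c", "A", "int main() { return 1; }\n")
repo.unlock("hello.c", "A")
repo.lock("hello.c", "B")          # succeeds now that A has unlocked
```

While A holds the lock, B can do nothing useful with hello.c, which is
exactly the contention described above.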

CVS takes a more mellow approach.  Rather than requiring that developers
coordinate with each other to avoid conflicts, CVS enables developers to
edit simultaneously, assumes the burden of integrating all the changes,
and keeps track of any conflicts.  This process uses the
copy-modify-merge model, which works as follows:

  1. Developer A requests a working copy (a directory tree containing
     the files that make up the project) from CVS.  This is also known
     as "checking out" a working copy, like checking a book out of the
     library.

  2. Developer A edits freely in her working copy.  At the same time,
     other developers may be busy in their own working copies.  Because
     these are all separate copies, there is no interference; it is as
     though all of the developers have their own copy of the same
     library book, and they're all at work scribbling comments in the
     margins or rewriting certain pages independently.

  3. Developer A finishes her changes and commits them into CVS along
     with a "log message", which is a comment explaining the nature and
     purpose of the changes.  This is like informing the library of
     what changes she made to the book and why.  The library then
     incorporates these changes into a "master" copy, where they are
     recorded for all time.

  4. Meanwhile, other developers can have CVS query the library to see
     if the master copy has changed recently.  If it has, CVS
     automatically updates their working copies.  (This part is magical
     and wonderful, and I hope you appreciate it.  Imagine how
     different the world would be if real books worked this way!)
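
The four steps above can be sketched in a few lines of Python.  This is
only a toy model under simplifying assumptions: the names Repository and
WorkingCopy are invented here, the master copy keeps whole snapshots,
and the two developers edit different files, so no merging of
overlapping edits is needed.

```python
class Repository:
    """Toy master copy: a list of (log message, snapshot) pairs."""

    def __init__(self, files):
        self.history = [("initial import", dict(files))]

    def checkout(self):
        # Step 1: hand out an independent copy of the latest snapshot.
        return WorkingCopy(self, dict(self.history[-1][1]))

    def commit(self, changed_files, log_message):
        # Step 3: record the changes, with their log message, for all time.
        snapshot = dict(self.history[-1][1])
        snapshot.update(changed_files)
        self.history.append((log_message, snapshot))


class WorkingCopy:
    def __init__(self, repo, files):
        self.repo, self.files = repo, files
        self.modified = set()          # names edited but not yet committed

    def edit(self, name, contents):
        # Step 2: edits touch only this copy, never anyone else's.
        self.files[name] = contents
        self.modified.add(name)

    def commit(self, log_message):
        changes = {name: self.files[name] for name in self.modified}
        self.repo.commit(changes, log_message)
        self.modified.clear()

    def update(self):
        # Step 4: pull in files changed in the master copy, keeping
        # local edits that have not been committed yet.
        for name, contents in self.repo.history[-1][1].items():
            if name not in self.modified:
                self.files[name] = contents


repo = Repository({"README": "v1\n", "main.c": "int main() {}\n"})
a = repo.checkout()
b = repo.checkout()
a.edit("README", "v2\n")
a.commit("Clarify the README")
b.edit("main.c", "int main() { return 0; }\n")
b.update()                             # B now sees A's README as well
```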


As far as CVS is concerned, all developers on a project are equal.
Deciding when to update or when to commit is largely a matter of
personal preference or project policy.  One common strategy for coding
projects is to always update before commencing work on a major change
and to commit only when the changes are complete and tested so that the
master copy is always in a "runnable" state.

Perhaps you're wondering what happens when developers A and B, each in
their own working copy, make different changes to the same area of text
and then both commit their changes? This is called a "conflict", and
CVS notices it as soon as developer B tries to commit changes.  Instead
of allowing developer B to proceed, CVS tells him to update first.  The
update reports the conflict and places conflict markers (easily
recognizable textual flags) at the conflicting location in his copy.
That location also shows both sets of changes, arranged for easy
comparison.  Developer B must sort it all out and commit a new revision
with the conflict resolved.  Perhaps the two developers will need to
talk to each other to settle the issue.  CVS only alerts the developers
that there is a conflict; it's up to human beings to actually resolve
it.
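
Here is a rough Python sketch of how such a conflict can be detected
and marked.  It is not CVS's merge code: the function merge_lines is
invented for illustration, and it assumes both developers only changed
lines in place, so all three versions have the same number of lines.
The marker layout is modeled on the markers CVS writes; the file name
and the revision label 1.2 are made up.

```python
def merge_lines(base, mine, theirs, name="file", their_rev="theirs"):
    """Three-way merge of equal-length line lists.

    base is the common ancestor, mine the working copy, theirs the
    repository version.  Returns (merged_lines, had_conflict).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, had_conflict = [], False
    for b, m, t in zip(base, mine, theirs):
        if m == t:            # both sides agree (or neither changed)
            merged.append(m)
        elif m == b:          # only the other developer changed this line
            merged.append(t)
        elif t == b:          # only I changed this line
            merged.append(m)
        else:                 # both changed it differently: a conflict
            had_conflict = True
            merged += [f"<<<<<<< {name}", m, "=======", t,
                       f">>>>>>> {their_rev}"]
    return merged, had_conflict


base = ["int main() {", '  puts("hello");', "  return 0;", "}"]
mine = ["int main() {", '  puts("hi");', "  return 0;", "}"]      # B's copy
theirs = ["int main() {", '  puts("hey");', "  return 1;", "}"]   # A's commit
merged, had_conflict = merge_lines(base, mine, theirs, "hello.c", "1.2")
```

The changed return line merges cleanly because only A touched it; the
puts line was changed by both, so it ends up between markers for B to
sort out by hand.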

What about the master copy? In official CVS terminology, it is called
the project's repository.  The repository is simply a file tree kept on
a central server.  Without going into too much detail about its
structure (but see Repository Administration), let's look at
what the repository must do to meet the requirements of the
checkout-commit-update cycle.  Consider the following scenario:

  1. Two developers, A and B, check out working copies of a project at
     the same time.  The project is at its starting point: no changes
     have been committed by anyone yet, so all the files are in their
     original, pristine state.

  2. Developer A gets right to work and soon commits her first batch of
     changes.

  3. Meanwhile, developer B watches television.

  4. Developer A, hacking away like there's no tomorrow, commits her
     second batch of changes.  Now, the repository's history contains
     the original files, followed by A's first batch of changes,
     followed by this set of changes.

  5. Meanwhile, developer B plays video games.

  6. Suddenly, developer C joins the project and checks out a working
     copy from the repository.  Developer C's copy reflects A's first
     two sets of changes, because they were already in the repository
     when C checked out her copy.

  7. Developer A, continuing to code as one possessed by spirits,
     completes and commits her third batch of changes.

  8. Finally, blissfully unaware of the recent frenzy of activity,
     developer B decides it's time to start work.  He doesn't bother to
     update his copy; he just commences editing files, some of which
     may be files that A has worked in.  Shortly thereafter, developer
     B commits his first changes.


At this point, one of two things can happen.  If none of the files
edited by developer B have been edited by A, the commit succeeds.
However, if CVS realizes that some of B's files are out of date with
respect to the repository's latest copies, and those files have also
been changed by B in his working copy, CVS informs B that he must do an
update before committing those files.

When developer B runs the update, CVS merges all of A's changes into B's
local copies of the files.  Some of A's work may conflict with B's
uncommitted changes, and some may not.  Those parts that don't are
simply applied to B's copies without further complication, but the
conflicting changes must be resolved by B before being committed.
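
The up-to-date check in this scenario can be modeled roughly as
follows.  This is a sketch under stated assumptions, not how CVS is
implemented: the class Repo and the exception OutOfDate are invented
names, plain integers stand in for revision numbers, and the merge that
a real update performs is left out.

```python
class OutOfDate(Exception):
    """The working copy's file is older than the repository's."""


class Repo:
    def __init__(self, files):
        # name -> list of contents; the list index is the revision number
        self.revs = {name: [text] for name, text in files.items()}

    def head(self, name):
        return len(self.revs[name]) - 1

    def commit(self, name, text, base_rev):
        # Refuse the commit unless it was made against the latest revision.
        if base_rev != self.head(name):
            raise OutOfDate(f"{name}: update needed before commit")
        self.revs[name].append(text)
        return self.head(name)


repo = Repo({"a.c": "a0", "b.c": "b0"})
b_base = {"a.c": 0, "b.c": 0}            # B checked out at the start

rev = 0
for text in ("a1", "a2", "a3"):          # A's three commits to a.c
    rev = repo.commit("a.c", text, base_rev=rev)

# B's change to b.c goes through: A never touched that file.
b_base["b.c"] = repo.commit("b.c", "b1", b_base["b.c"])

try:
    repo.commit("a.c", "a0 + B's edit", b_base["a.c"])
    refused = False
except OutOfDate:
    refused = True                       # B must update a.c first

b_base["a.c"] = repo.head("a.c")         # stands in for "update"
b_base["a.c"] = repo.commit("a.c", "a3 + B's edit", b_base["a.c"])
```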

If developer C does an update now, she'll receive various new changes
from the repository: those from A's third commit, and those from B's
first _successful_ commit (which might really come from B's second
attempt to commit, assuming B's first attempt resulted in B being forced
to resolve conflicts).

In order for CVS to serve up changes, in the correct sequence, to
developers whose working copies may be out of sync by varying degrees,
the repository needs to store all commits since the project's beginning.
In practice, the CVS repository stores them all as successive diffs.
Thus, even for a very old working copy, CVS is able to calculate the
difference between the working copy's files and the current state of the
repository, and is thereby able to bring the working copy up to date
efficiently.  This makes it easy for developers to view the project's
history at any point and to revive even very old working copies.

Although, strictly speaking, the repository could achieve the same
results by other means, in practice, storing diffs is a simple,
intuitive means of implementing the necessary functionality.  The
process has the added benefit that, by using patch appropriately, CVS
can reconstruct any previous state of the file tree and thus bring any
working copy from one state to another.  It can allow someone to check
out the project as it looked at any particular time.  It can also show
the differences, in diff format, between two states of the tree without
affecting someone's working copy.
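
To see why a chain of diffs is enough, here is a small Python sketch
that keeps the original text plus one delta per commit and rebuilds any
earlier state on request.  It is an illustration only, not the
repository's actual file format: History, make_delta, and apply_delta
are invented names, and Python's standard difflib module stands in for
the diff and patch programs.

```python
import difflib


def make_delta(old, new):
    """Describe how to turn the line list `old` into `new`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    # Keep only the changed regions: (start, end, replacement lines).
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the newer line list from `old` and a delta."""
    result, pos = [], 0
    for start, end, replacement in delta:
        result += old[pos:start]     # unchanged lines before this region
        result += replacement        # lines that replace old[start:end]
        pos = end
    return result + old[pos:]


class History:
    """Original text plus one delta per commit."""

    def __init__(self, lines):
        self.original = list(lines)
        self.deltas = []
        self._head = list(lines)

    def commit(self, lines):
        self.deltas.append(make_delta(self._head, lines))
        self._head = list(lines)

    def revision(self, n):
        """Return the file as of commit n (0 is the original)."""
        lines = self.original
        for delta in self.deltas[:n]:
            lines = apply_delta(lines, delta)
        return lines


h = History(["a", "b", "c"])
h.commit(["a", "B", "c"])
h.commit(["a", "B", "c", "d"])

# Compare two historical states without touching any working copy.
diff_text = list(difflib.unified_diff(h.revision(0), h.revision(2),
                                      "rev0", "rev2", lineterm=""))
```

Calling h.revision(1) replays only the first delta, so any past state
stays reachable no matter how many commits have piled up since.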

Thus, the very features necessary to give convenient access to a
project's history are also useful for providing a decentralized,
uncoordinated developer team with the ability to collaborate on the
project.

For now, you can ignore the details of setting up a repository,
administering user access, and navigating CVS-specific file formats
(those will be covered in Repository Administration).  For the
moment, we'll concentrate on how to make changes in a working copy.

But first, here is a quick review of terms:

   * "Revision": A committed change in the history of a file or set of
     files.  A revision is one "snapshot" in a constantly changing
     project.

   * "Repository": The master copy where CVS stores a project's full
     revision history.  Each project has exactly one repository.

   * "Working copy": The copy in which you actually make changes to a
     project.  There can be many working copies of a given project;
     generally each developer has his or her own copy.

   * "Check out": To request a working copy from the repository.  Your
     working copy reflects the state of the project as of the moment you
     checked it out; when you and other developers make changes, you
     must use commit and update to "publish" your changes and view
     others' changes.

   * "Commit": To send changes from your working copy into the central
     repository.  Also known as "check-in".

   * "Log message": A comment you attach to a revision when you commit
     it, describing the changes.  Others can page through the log
     messages to get a summary of what's been going on in a project.

   * "Update": To bring others' changes from the repository into your
     working copy and to show if your working copy has any uncommitted
     changes.  Be careful not to confuse this with commit; they are
     complementary operations.  Mnemonic: update brings your working
     copy up to date with the repository copy.

   * "Conflict": The situation when two developers try to commit changes
     to the same region of the same file.  CVS notices and points out
     conflicts, but the developers must resolve them.


